2020-05-09 07:21:57

by syzbot

[permalink] [raw]
Subject: WARNING in memtype_reserve

Hello,

syzbot found the following crash on:

HEAD commit: d5eeab8d Merge tag 'scsi-fixes' of git://git.kernel.org/pu..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=15093632100000
kernel config: https://syzkaller.appspot.com/x/.config?x=b0212dbee046bc1f
dashboard link: https://syzkaller.appspot.com/bug?extid=353be47c9ce21b68b7ed
compiler: gcc (GCC) 9.0.0 20181231 (experimental)
userspace arch: i386
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=168ee02c100000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=119f3788100000

The bug was bisected to:

commit 2bef9aed6f0e22391c8d4570749b1acc9bc3981e
Author: Jeremy Linton <[email protected]>
Date: Mon May 4 20:13:48 2020 +0000

usb: usbfs: correct kernel->user page attribute mismatch

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=1701f168100000
final crash: https://syzkaller.appspot.com/x/report.txt?x=1481f168100000
console output: https://syzkaller.appspot.com/x/log.txt?x=1081f168100000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: [email protected]
Fixes: 2bef9aed6f0e ("usb: usbfs: correct kernel->user page attribute mismatch")

------------[ cut here ]------------
memtype_reserve failed: [mem 0xffffffffff000-0x00008fff], req write-back
WARNING: CPU: 1 PID: 7025 at arch/x86/mm/pat/memtype.c:589 memtype_reserve+0x69f/0x820 arch/x86/mm/pat/memtype.c:589
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 7025 Comm: syz-executor254 Not tainted 5.7.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x188/0x20d lib/dump_stack.c:118
panic+0x2e3/0x75c kernel/panic.c:221
__warn.cold+0x2f/0x35 kernel/panic.c:582
report_bug+0x27b/0x2f0 lib/bug.c:195
fixup_bug arch/x86/kernel/traps.c:175 [inline]
fixup_bug arch/x86/kernel/traps.c:170 [inline]
do_error_trap+0x12b/0x220 arch/x86/kernel/traps.c:267
do_invalid_op+0x32/0x40 arch/x86/kernel/traps.c:286
invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1027
RIP: 0010:memtype_reserve+0x69f/0x820 arch/x86/mm/pat/memtype.c:589
Code: 48 8b 2c ed c0 00 29 88 e8 ae ad 3e 00 48 8d 4b ff 49 89 e8 4c 89 e2 48 c7 c6 20 01 29 88 48 c7 c7 80 f9 28 88 e8 79 e8 0f 00 <0f> 0b 41 bf ea ff ff ff e9 03 fc ff ff 41 bf ea ff ff ff e9 f8 fb
RSP: 0018:ffffc900015e7790 EFLAGS: 00010282
RAX: 0000000000000000 RBX: 0000000000009000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff815ce181 RDI: fffff520002bcee4
RBP: ffffffff8828ff40 R08: ffff888097ce85c0 R09: ffffed1015ce45f1
R10: ffff8880ae722f83 R11: ffffed1015ce45f0 R12: 000ffffffffff000
R13: 1ffff920002bcef8 R14: dffffc0000000000 R15: 0000000000000000
reserve_pfn_range+0x173/0x470 arch/x86/mm/pat/memtype.c:941
track_pfn_remap+0x18b/0x280 arch/x86/mm/pat/memtype.c:1033
remap_pfn_range+0x202/0xbf0 mm/memory.c:2130
dma_direct_mmap+0x197/0x260 kernel/dma/direct.c:453
dma_mmap_attrs+0xfe/0x150 kernel/dma/mapping.c:237
usbdev_mmap+0x3ae/0x730 drivers/usb/core/devio.c:254
call_mmap include/linux/fs.h:1912 [inline]
mmap_region+0xafb/0x1540 mm/mmap.c:1772
do_mmap+0x849/0x1160 mm/mmap.c:1545
do_mmap_pgoff include/linux/mm.h:2553 [inline]
vm_mmap_pgoff+0x197/0x200 mm/util.c:506
ksys_mmap_pgoff+0x457/0x5b0 mm/mmap.c:1595
do_syscall_32_irqs_on arch/x86/entry/common.c:337 [inline]
do_fast_syscall_32+0x270/0xe90 arch/x86/entry/common.c:396
entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
Kernel Offset: disabled
Rebooting in 86400 seconds..


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches


2020-05-09 07:47:00

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: WARNING in memtype_reserve

On Sat, May 09, 2020 at 12:20:14AM -0700, syzbot wrote:
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit: d5eeab8d Merge tag 'scsi-fixes' of git://git.kernel.org/pu..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=15093632100000
> kernel config: https://syzkaller.appspot.com/x/.config?x=b0212dbee046bc1f
> dashboard link: https://syzkaller.appspot.com/bug?extid=353be47c9ce21b68b7ed
> compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> userspace arch: i386
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=168ee02c100000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=119f3788100000
>
> The bug was bisected to:
>
> commit 2bef9aed6f0e22391c8d4570749b1acc9bc3981e
> Author: Jeremy Linton <[email protected]>
> Date: Mon May 4 20:13:48 2020 +0000
>
> usb: usbfs: correct kernel->user page attribute mismatch
>
> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=1701f168100000
> final crash: https://syzkaller.appspot.com/x/report.txt?x=1481f168100000
> console output: https://syzkaller.appspot.com/x/log.txt?x=1081f168100000
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: [email protected]
> Fixes: 2bef9aed6f0e ("usb: usbfs: correct kernel->user page attribute mismatch")
>
> ------------[ cut here ]------------
> memtype_reserve failed: [mem 0xffffffffff000-0x00008fff], req write-back
> WARNING: CPU: 1 PID: 7025 at arch/x86/mm/pat/memtype.c:589 memtype_reserve+0x69f/0x820 arch/x86/mm/pat/memtype.c:589
> Kernel panic - not syncing: panic_on_warn set ...
> CPU: 1 PID: 7025 Comm: syz-executor254 Not tainted 5.7.0-rc4-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Call Trace:
> __dump_stack lib/dump_stack.c:77 [inline]
> dump_stack+0x188/0x20d lib/dump_stack.c:118
> panic+0x2e3/0x75c kernel/panic.c:221
> __warn.cold+0x2f/0x35 kernel/panic.c:582
> report_bug+0x27b/0x2f0 lib/bug.c:195
> fixup_bug arch/x86/kernel/traps.c:175 [inline]
> fixup_bug arch/x86/kernel/traps.c:170 [inline]
> do_error_trap+0x12b/0x220 arch/x86/kernel/traps.c:267
> do_invalid_op+0x32/0x40 arch/x86/kernel/traps.c:286
> invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1027
> RIP: 0010:memtype_reserve+0x69f/0x820 arch/x86/mm/pat/memtype.c:589
> Code: 48 8b 2c ed c0 00 29 88 e8 ae ad 3e 00 48 8d 4b ff 49 89 e8 4c 89 e2 48 c7 c6 20 01 29 88 48 c7 c7 80 f9 28 88 e8 79 e8 0f 00 <0f> 0b 41 bf ea ff ff ff e9 03 fc ff ff 41 bf ea ff ff ff e9 f8 fb
> RSP: 0018:ffffc900015e7790 EFLAGS: 00010282
> RAX: 0000000000000000 RBX: 0000000000009000 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: ffffffff815ce181 RDI: fffff520002bcee4
> RBP: ffffffff8828ff40 R08: ffff888097ce85c0 R09: ffffed1015ce45f1
> R10: ffff8880ae722f83 R11: ffffed1015ce45f0 R12: 000ffffffffff000
> R13: 1ffff920002bcef8 R14: dffffc0000000000 R15: 0000000000000000
> reserve_pfn_range+0x173/0x470 arch/x86/mm/pat/memtype.c:941
> track_pfn_remap+0x18b/0x280 arch/x86/mm/pat/memtype.c:1033
> remap_pfn_range+0x202/0xbf0 mm/memory.c:2130
> dma_direct_mmap+0x197/0x260 kernel/dma/direct.c:453
> dma_mmap_attrs+0xfe/0x150 kernel/dma/mapping.c:237
> usbdev_mmap+0x3ae/0x730 drivers/usb/core/devio.c:254
> call_mmap include/linux/fs.h:1912 [inline]
> mmap_region+0xafb/0x1540 mm/mmap.c:1772
> do_mmap+0x849/0x1160 mm/mmap.c:1545
> do_mmap_pgoff include/linux/mm.h:2553 [inline]
> vm_mmap_pgoff+0x197/0x200 mm/util.c:506
> ksys_mmap_pgoff+0x457/0x5b0 mm/mmap.c:1595
> do_syscall_32_irqs_on arch/x86/entry/common.c:337 [inline]
> do_fast_syscall_32+0x270/0xe90 arch/x86/entry/common.c:396
> entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
> Kernel Offset: disabled
> Rebooting in 86400 seconds..

So should memtype_reserve() not do a WARN if given invalid parameters as
it can be triggered by userspace requests?

A normal "invalid request" debug line is probably all that is needed,
right?

thanks,

greg k-h

2020-05-09 10:03:23

by Thomas Gleixner

[permalink] [raw]
Subject: Re: WARNING in memtype_reserve

Greg KH <[email protected]> writes:
> On Sat, May 09, 2020 at 12:20:14AM -0700, syzbot wrote:
>> memtype_reserve failed: [mem 0xffffffffff000-0x00008fff], req write-back
>> WARNING: CPU: 1 PID: 7025 at arch/x86/mm/pat/memtype.c:589 memtype_reserve+0x69f/0x820 arch/x86/mm/pat/memtype.c:589
>
> So should memtype_reserve() not do a WARN if given invalid parameters as
> it can be triggered by userspace requests?
>
> A normal "invalid request" debug line is probably all that is needed,
> right?

I disagree. The callsite espcially if user space triggerable should not
attempt to ask for a reservation where start > end:

>> memtype_reserve failed: [mem 0xffffffffff000-0x00008fff], req write-back

The real question is which part of the call chain is responsible for
this. That needs to be fixed.

Thanks,

tglx

2020-05-09 13:44:48

by Alan Stern

[permalink] [raw]
Subject: Re: WARNING in memtype_reserve

On Sat, 9 May 2020, Thomas Gleixner wrote:

> Greg KH <[email protected]> writes:
> > On Sat, May 09, 2020 at 12:20:14AM -0700, syzbot wrote:
> >> memtype_reserve failed: [mem 0xffffffffff000-0x00008fff], req write-back
> >> WARNING: CPU: 1 PID: 7025 at arch/x86/mm/pat/memtype.c:589 memtype_reserve+0x69f/0x820 arch/x86/mm/pat/memtype.c:589
> >
> > So should memtype_reserve() not do a WARN if given invalid parameters as
> > it can be triggered by userspace requests?
> >
> > A normal "invalid request" debug line is probably all that is needed,
> > right?
>
> I disagree. The callsite espcially if user space triggerable should not
> attempt to ask for a reservation where start > end:
>
> >> memtype_reserve failed: [mem 0xffffffffff000-0x00008fff], req write-back
>
> The real question is which part of the call chain is responsible for
> this. That needs to be fixed.

What about all the other ways in which a reservation request could be
invalid? The MM core already checks for these; what point is there in
duplicating these checks in many places higher up the call chain?

Alan Stern

2020-05-09 17:46:35

by Jeremy Linton

[permalink] [raw]
Subject: Re: WARNING in memtype_reserve

Hi,

On 5/9/20 2:20 AM, syzbot wrote:
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit: d5eeab8d Merge tag 'scsi-fixes' of git://git.kernel.org/pu..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=15093632100000
> kernel config: https://syzkaller.appspot.com/x/.config?x=b0212dbee046bc1f
> dashboard link: https://syzkaller.appspot.com/bug?extid=353be47c9ce21b68b7ed
> compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> userspace arch: i386
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=168ee02c100000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=119f3788100000
>
> The bug was bisected to:
>
> commit 2bef9aed6f0e22391c8d4570749b1acc9bc3981e
> Author: Jeremy Linton <[email protected]>
> Date: Mon May 4 20:13:48 2020 +0000
>
> usb: usbfs: correct kernel->user page attribute mismatch
>
> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=1701f168100000
> final crash: https://syzkaller.appspot.com/x/report.txt?x=1481f168100000
> console output: https://syzkaller.appspot.com/x/log.txt?x=1081f168100000
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: [email protected]
> Fixes: 2bef9aed6f0e ("usb: usbfs: correct kernel->user page attribute mismatch")
>
> ------------[ cut here ]------------
> memtype_reserve failed: [mem 0xffffffffff000-0x00008fff], req write-back
> WARNING: CPU: 1 PID: 7025 at arch/x86/mm/pat/memtype.c:589 memtype_reserve+0x69f/0x820 arch/x86/mm/pat/memtype.c:589
> Kernel panic - not syncing: panic_on_warn set ...
> CPU: 1 PID: 7025 Comm: syz-executor254 Not tainted 5.7.0-rc4-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Call Trace:
> __dump_stack lib/dump_stack.c:77 [inline]
> dump_stack+0x188/0x20d lib/dump_stack.c:118
> panic+0x2e3/0x75c kernel/panic.c:221
> __warn.cold+0x2f/0x35 kernel/panic.c:582
> report_bug+0x27b/0x2f0 lib/bug.c:195
> fixup_bug arch/x86/kernel/traps.c:175 [inline]
> fixup_bug arch/x86/kernel/traps.c:170 [inline]
> do_error_trap+0x12b/0x220 arch/x86/kernel/traps.c:267
> do_invalid_op+0x32/0x40 arch/x86/kernel/traps.c:286
> invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1027
> RIP: 0010:memtype_reserve+0x69f/0x820 arch/x86/mm/pat/memtype.c:589
> Code: 48 8b 2c ed c0 00 29 88 e8 ae ad 3e 00 48 8d 4b ff 49 89 e8 4c 89 e2 48 c7 c6 20 01 29 88 48 c7 c7 80 f9 28 88 e8 79 e8 0f 00 <0f> 0b 41 bf ea ff ff ff e9 03 fc ff ff 41 bf ea ff ff ff e9 f8 fb
> RSP: 0018:ffffc900015e7790 EFLAGS: 00010282
> RAX: 0000000000000000 RBX: 0000000000009000 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: ffffffff815ce181 RDI: fffff520002bcee4
> RBP: ffffffff8828ff40 R08: ffff888097ce85c0 R09: ffffed1015ce45f1
> R10: ffff8880ae722f83 R11: ffffed1015ce45f0 R12: 000ffffffffff000
> R13: 1ffff920002bcef8 R14: dffffc0000000000 R15: 0000000000000000
> reserve_pfn_range+0x173/0x470 arch/x86/mm/pat/memtype.c:941
> track_pfn_remap+0x18b/0x280 arch/x86/mm/pat/memtype.c:1033
> remap_pfn_range+0x202/0xbf0 mm/memory.c:2130
> dma_direct_mmap+0x197/0x260 kernel/dma/direct.c:453
> dma_mmap_attrs+0xfe/0x150 kernel/dma/mapping.c:237
> usbdev_mmap+0x3ae/0x730 drivers/usb/core/devio.c:254
> call_mmap include/linux/fs.h:1912 [inline]
> mmap_region+0xafb/0x1540 mm/mmap.c:1772
> do_mmap+0x849/0x1160 mm/mmap.c:1545
> do_mmap_pgoff include/linux/mm.h:2553 [inline]
> vm_mmap_pgoff+0x197/0x200 mm/util.c:506
> ksys_mmap_pgoff+0x457/0x5b0 mm/mmap.c:1595
> do_syscall_32_irqs_on arch/x86/entry/common.c:337 [inline]
> do_fast_syscall_32+0x270/0xe90 arch/x86/entry/common.c:396
> entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
> Kernel Offset: disabled
> Rebooting in 86400 seconds..
>
>
> ---
> This bug is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at [email protected].
>
> syzbot will keep track of this bug report. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection
> syzbot can test patches for this bug, for details see:
> https://goo.gl/tpsmEJ#testing-patches
>

Adding a range check in dma_direct_mmap() looks like a fine fix. I'm
curious how the mapping attribute changed though. I expected that would
be rare on x86 (ISA devices/etc).

What is the GCE USB controller here? Its not the same as the virtual
qemu/pci/xhci right?

2020-05-13 12:44:07

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: WARNING in memtype_reserve

On Sat, May 09, 2020 at 11:47:28PM +0800, Hillf Danton wrote:
>
> Sat, 09 May 2020 00:20:14 -0700
> >syzbot found the following crash on:
> >
> >HEAD commit: d5eeab8d Merge tag 'scsi-fixes' of git://git.kernel.org/pu..
> >git tree: upstream
> >console output: https://syzkaller.appspot.com/x/log.txt?x=15093632100000
> >kernel config: https://syzkaller.appspot.com/x/.config?x=b0212dbee046bc1f
> >dashboard link: https://syzkaller.appspot.com/bug?extid=353be47c9ce21b68b7ed
> >compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> >userspace arch: i386
> >syz repro: https://syzkaller.appspot.com/x/repro.syz?x=168ee02c100000
> >C reproducer: https://syzkaller.appspot.com/x/repro.c?x=119f3788100000
> >
> >The bug was bisected to:
> >
> >commit 2bef9aed6f0e22391c8d4570749b1acc9bc3981e
> >Author: Jeremy Linton <[email protected]>
> >Date: Mon May 4 20:13:48 2020 +0000
> >
> > usb: usbfs: correct kernel->user page attribute mismatch
> >
> >bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=1701f168100000
> >final crash: https://syzkaller.appspot.com/x/report.txt?x=1481f168100000
> >console output: https://syzkaller.appspot.com/x/log.txt?x=1081f168100000
> >
> >IMPORTANT: if you fix the bug, please add the following tag to the commit:
> >Reported-by: [email protected]
> >Fixes: 2bef9aed6f0e ("usb: usbfs: correct kernel->user page attribute mismatch")
> >
> >------------[ cut here ]------------
> >memtype_reserve failed: [mem 0xffffffffff000-0x00008fff], req write-back
> >WARNING: CPU: 1 PID: 7025 at arch/x86/mm/pat/memtype.c:589 memtype_reserve+0x69f/0x820 arch/x86/mm/pat/memtype.c:589
> >Kernel panic - not syncing: panic_on_warn set ...
> >CPU: 1 PID: 7025 Comm: syz-executor254 Not tainted 5.7.0-rc4-syzkaller #0
> >Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> >Call Trace:
> > __dump_stack lib/dump_stack.c:77 [inline]
> > dump_stack+0x188/0x20d lib/dump_stack.c:118
> > panic+0x2e3/0x75c kernel/panic.c:221
> > __warn.cold+0x2f/0x35 kernel/panic.c:582
> > report_bug+0x27b/0x2f0 lib/bug.c:195
> > fixup_bug arch/x86/kernel/traps.c:175 [inline]
> > fixup_bug arch/x86/kernel/traps.c:170 [inline]
> > do_error_trap+0x12b/0x220 arch/x86/kernel/traps.c:267
> > do_invalid_op+0x32/0x40 arch/x86/kernel/traps.c:286
> > invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1027
> >RIP: 0010:memtype_reserve+0x69f/0x820 arch/x86/mm/pat/memtype.c:589
> >Code: 48 8b 2c ed c0 00 29 88 e8 ae ad 3e 00 48 8d 4b ff 49 89 e8 4c 89 e2 48 c7 c6 20 01 29 88 48 c7 c7 80 f9 28 88 e8 79 e8 0f 00 <0f> 0b 41 bf ea ff ff ff e9 03 fc ff ff 41 bf ea ff ff ff e9 f8 fb
> >RSP: 0018:ffffc900015e7790 EFLAGS: 00010282
> >RAX: 0000000000000000 RBX: 0000000000009000 RCX: 0000000000000000
> >RDX: 0000000000000000 RSI: ffffffff815ce181 RDI: fffff520002bcee4
> >RBP: ffffffff8828ff40 R08: ffff888097ce85c0 R09: ffffed1015ce45f1
> >R10: ffff8880ae722f83 R11: ffffed1015ce45f0 R12: 000ffffffffff000
> >R13: 1ffff920002bcef8 R14: dffffc0000000000 R15: 0000000000000000
> > reserve_pfn_range+0x173/0x470 arch/x86/mm/pat/memtype.c:941
> > track_pfn_remap+0x18b/0x280 arch/x86/mm/pat/memtype.c:1033
> > remap_pfn_range+0x202/0xbf0 mm/memory.c:2130
> > dma_direct_mmap+0x197/0x260 kernel/dma/direct.c:453
> > dma_mmap_attrs+0xfe/0x150 kernel/dma/mapping.c:237
> > usbdev_mmap+0x3ae/0x730 drivers/usb/core/devio.c:254
> > call_mmap include/linux/fs.h:1912 [inline]
> > mmap_region+0xafb/0x1540 mm/mmap.c:1772
> > do_mmap+0x849/0x1160 mm/mmap.c:1545
> > do_mmap_pgoff include/linux/mm.h:2553 [inline]
> > vm_mmap_pgoff+0x197/0x200 mm/util.c:506
> > ksys_mmap_pgoff+0x457/0x5b0 mm/mmap.c:1595
> > do_syscall_32_irqs_on arch/x86/entry/common.c:337 [inline]
> > do_fast_syscall_32+0x270/0xe90 arch/x86/entry/common.c:396
> > entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
>
> Add check of physical address and boundary.
>
> --- a/kernel/dma/direct.c
> +++ b/kernel/dma/direct.c
> @@ -447,6 +447,7 @@ int dma_direct_mmap(struct device *dev,
> unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> unsigned long pfn = PHYS_PFN(dma_to_phys(dev, dma_addr));
> int ret = -ENXIO;
> + unsigned long pa;
>
> vma->vm_page_prot = dma_pgprot(dev, vma->vm_page_prot, attrs);
>
> @@ -455,6 +456,13 @@ int dma_direct_mmap(struct device *dev,
>
> if (vma->vm_pgoff >= count || user_count > count - vma->vm_pgoff)
> return -ENXIO;
> + if (pfn > pfn + vma->vm_pgoff)
> + return -ENXIO;
> + pa = (pfn + vma->vm_pgoff) << PAGE_SHIFT;
> + if (pa < (pfn << PAGE_SHIFT))
> + return -ENXIO;
> + if (pa > pa + (user_count << PAGE_SHIFT))
> + return -ENXIO;
> return remap_pfn_range(vma, vma->vm_start, pfn + vma->vm_pgoff,
> user_count << PAGE_SHIFT, vma->vm_page_prot);


Isn't there some sort of "check mmap parms" function somewhere that
should do this logic all in one place? Putting it all over the kernel
feels wrong to me...

thanks,

greg k-h

2020-05-13 12:47:48

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: WARNING in memtype_reserve

On Sat, May 09, 2020 at 12:00:57PM +0200, Thomas Gleixner wrote:
> Greg KH <[email protected]> writes:
> > On Sat, May 09, 2020 at 12:20:14AM -0700, syzbot wrote:
> >> memtype_reserve failed: [mem 0xffffffffff000-0x00008fff], req write-back
> >> WARNING: CPU: 1 PID: 7025 at arch/x86/mm/pat/memtype.c:589 memtype_reserve+0x69f/0x820 arch/x86/mm/pat/memtype.c:589
> >
> > So should memtype_reserve() not do a WARN if given invalid parameters as
> > it can be triggered by userspace requests?
> >
> > A normal "invalid request" debug line is probably all that is needed,
> > right?
>
> I disagree. The callsite espcially if user space triggerable should not
> attempt to ask for a reservation where start > end:
>
> >> memtype_reserve failed: [mem 0xffffffffff000-0x00008fff], req write-back
>
> The real question is which part of the call chain is responsible for
> this. That needs to be fixed.

This is caused by 2bef9aed6f0e ("usb: usbfs: correct kernel->user page
attribute mismatch") which changed a call to remap_pfn_range() to
dma_mmap_coherent(). Looks like the error checking in remap_pfn_range()
handled the invalid options better than dma_mma_coherent() when odd
values are passed in.

We can add the check to dma_mmap_coherent(), again, but really, this
type of check should probably only be needed in one place to ensure we
always get it correct, right?

thanks,

greg k-h

2020-05-13 16:26:29

by Thomas Gleixner

[permalink] [raw]
Subject: Re: WARNING in memtype_reserve

Alan Stern <[email protected]> writes:
> On Sat, 9 May 2020, Thomas Gleixner wrote:
>
>> Greg KH <[email protected]> writes:
>> > On Sat, May 09, 2020 at 12:20:14AM -0700, syzbot wrote:
>> >> memtype_reserve failed: [mem 0xffffffffff000-0x00008fff], req write-back
>> >> WARNING: CPU: 1 PID: 7025 at arch/x86/mm/pat/memtype.c:589 memtype_reserve+0x69f/0x820 arch/x86/mm/pat/memtype.c:589
>> >
>> > So should memtype_reserve() not do a WARN if given invalid parameters as
>> > it can be triggered by userspace requests?
>> >
>> > A normal "invalid request" debug line is probably all that is needed,
>> > right?
>>
>> I disagree. The callsite espcially if user space triggerable should not
>> attempt to ask for a reservation where start > end:
>>
>> >> memtype_reserve failed: [mem 0xffffffffff000-0x00008fff], req write-back
>>
>> The real question is which part of the call chain is responsible for
>> this. That needs to be fixed.
>
> What about all the other ways in which a reservation request could be
> invalid? The MM core already checks for these; what point is there in
> duplicating these checks in many places higher up the call chain?

Defensive programming rule #1: Check crap early but have the check which
ultimatively catches it at the last possible place as well.

2020-05-13 16:58:15

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: WARNING in memtype_reserve

On Wed, May 13, 2020 at 06:22:58PM +0200, Thomas Gleixner wrote:
> Greg KH <[email protected]> writes:
> > On Sat, May 09, 2020 at 12:00:57PM +0200, Thomas Gleixner wrote:
> >> Greg KH <[email protected]> writes:
> >> > On Sat, May 09, 2020 at 12:20:14AM -0700, syzbot wrote:
> >> >> memtype_reserve failed: [mem 0xffffffffff000-0x00008fff], req write-back
> >> >> WARNING: CPU: 1 PID: 7025 at arch/x86/mm/pat/memtype.c:589 memtype_reserve+0x69f/0x820 arch/x86/mm/pat/memtype.c:589
> >> >
> >> > So should memtype_reserve() not do a WARN if given invalid parameters as
> >> > it can be triggered by userspace requests?
> >> >
> >> > A normal "invalid request" debug line is probably all that is needed,
> >> > right?
> >>
> >> I disagree. The callsite espcially if user space triggerable should not
> >> attempt to ask for a reservation where start > end:
> >>
> >> >> memtype_reserve failed: [mem 0xffffffffff000-0x00008fff], req write-back
> >>
> >> The real question is which part of the call chain is responsible for
> >> this. That needs to be fixed.
> >
> > This is caused by 2bef9aed6f0e ("usb: usbfs: correct kernel->user page
> > attribute mismatch") which changed a call to remap_pfn_range() to
> > dma_mmap_coherent(). Looks like the error checking in remap_pfn_range()
> > handled the invalid options better than dma_mma_coherent() when odd
> > values are passed in.
> >
> > We can add the check to dma_mmap_coherent(), again, but really, this
> > type of check should probably only be needed in one place to ensure we
> > always get it correct, right?
>
> That might be correct for this particular call chain, but this check
> really is the last defense before stuff goes down the drain. None of the
> last line functions should ever be reached with crappy arguments.

Looking at the other callers of dma_mmap_coherent(), it looks like this
needs to be done in that function, as other drivers are passing in "raw"
data as well. So Hillf's patch is probably the best one.

Hillf, can you resend it in a format we can apply it in and have syzbot
test?

thanks,

greg k-h

2020-05-13 20:50:20

by Thomas Gleixner

[permalink] [raw]
Subject: Re: WARNING in memtype_reserve

Greg KH <[email protected]> writes:
> On Sat, May 09, 2020 at 12:00:57PM +0200, Thomas Gleixner wrote:
>> Greg KH <[email protected]> writes:
>> > On Sat, May 09, 2020 at 12:20:14AM -0700, syzbot wrote:
>> >> memtype_reserve failed: [mem 0xffffffffff000-0x00008fff], req write-back
>> >> WARNING: CPU: 1 PID: 7025 at arch/x86/mm/pat/memtype.c:589 memtype_reserve+0x69f/0x820 arch/x86/mm/pat/memtype.c:589
>> >
>> > So should memtype_reserve() not do a WARN if given invalid parameters as
>> > it can be triggered by userspace requests?
>> >
>> > A normal "invalid request" debug line is probably all that is needed,
>> > right?
>>
>> I disagree. The callsite espcially if user space triggerable should not
>> attempt to ask for a reservation where start > end:
>>
>> >> memtype_reserve failed: [mem 0xffffffffff000-0x00008fff], req write-back
>>
>> The real question is which part of the call chain is responsible for
>> this. That needs to be fixed.
>
> This is caused by 2bef9aed6f0e ("usb: usbfs: correct kernel->user page
> attribute mismatch") which changed a call to remap_pfn_range() to
> dma_mmap_coherent(). Looks like the error checking in remap_pfn_range()
> handled the invalid options better than dma_mma_coherent() when odd
> values are passed in.
>
> We can add the check to dma_mmap_coherent(), again, but really, this
> type of check should probably only be needed in one place to ensure we
> always get it correct, right?

That might be correct for this particular call chain, but this check
really is the last defense before stuff goes down the drain. None of the
last line functions should ever be reached with crappy arguments.

Thanks,

tglx

2020-05-14 06:16:04

by Christoph Hellwig

[permalink] [raw]
Subject: Re: WARNING in memtype_reserve

Guys, can you please start formal thread on this? I have no
idea where this came from and what the rationale is. Btw, if the
pfn is crap in dma_direct_mmap then the dma_addr_t passed in is
crap, as it is derived from that. What is the caller, and how is
this triggered?

2020-05-14 06:21:27

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: WARNING in memtype_reserve

On Thu, May 14, 2020 at 8:14 AM Christoph Hellwig <[email protected]> wrote:
>
> Guys, can you please start formal thread on this? I have no
> idea where this came from and what the rationale is. Btw, if the
> pfn is crap in dma_direct_mmap then the dma_addr_t passed in is
> crap, as it is derived from that. What is the caller, and how is
> this triggered?

FWIW the whole thread is available on lore:
https://lore.kernel.org/lkml/[email protected]/T/#t

2020-05-14 06:29:58

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Validating dma_mmap_coherent() parameters before calling (was Re: WARNING in memtype_reserve)

On Thu, May 14, 2020 at 08:14:17AM +0200, Christoph Hellwig wrote:
> Guys, can you please start formal thread on this? I have no
> idea where this came from and what the rationale is. Btw, if the
> pfn is crap in dma_direct_mmap then the dma_addr_t passed in is
> crap, as it is derived from that. What is the caller, and how is
> this triggered?


Ok, to summarize, commit 2bef9aed6f0e ("usb: usbfs: correct kernel->user
page attribute mismatch") changed a call from remap_pfn_range() to
dma_mmap_coherent() for usb data buffers being sent from userspace.

Details on why this was changed is in the commit changelog, but this has
triggered a WARN_ON() in memtype_reserve when I think some odd data
buffers were sent to it by syzbot fuzzing.

The warning caught the wrong data, but triggered syzbot.

So, the question is, should all callers of dma_mmap_coherent() validate
the parms better before it is called, or should the call chain here
properly sanitize things on their own.

Note, usbfs is not the only direct-from-userspace caller of
dma_mmap_coherent(), it's also used in other userspace apis (frame
buffer drivers, fastrpc, some SoC drivers) but it looks like nothing has
fuzzed those apis before to trigger this.

Does that help explain things better?

thanks,

greg k-h

2020-05-14 06:34:23

by Christoph Hellwig

[permalink] [raw]
Subject: Re: Validating dma_mmap_coherent() parameters before calling (was Re: WARNING in memtype_reserve)

On Thu, May 14, 2020 at 08:27:50AM +0200, Greg KH wrote:
> On Thu, May 14, 2020 at 08:14:17AM +0200, Christoph Hellwig wrote:
> > Guys, can you please start formal thread on this? I have no
> > idea where this came from and what the rationale is. Btw, if the
> > pfn is crap in dma_direct_mmap then the dma_addr_t passed in is
> > crap, as it is derived from that. What is the caller, and how is
> > this triggered?
>
>
> Ok, to summarize, commit 2bef9aed6f0e ("usb: usbfs: correct kernel->user
> page attribute mismatch") changed a call from remap_pfn_range() to
> dma_mmap_coherent() for usb data buffers being sent from userspace.

I only need to look at the commit for 3 seconds to tell you that it is
completely buggy. While using dma_mmap_coherent is fundamentally the
right thing and absolutely required for dma_alloc_* allocations, USB
also uses it's own local gen pool allocator or plain kmalloc for not
DMA capable controller. This need to use remap_pfn_range. I'm pretty
sure you hit one of those cases.

The logic should be something like:

if (hcd->localmem_pool || !hcd_uses_dma(hcd))
remap_pfn_range()
else
dma_mmap_coherent()

2020-05-14 07:49:14

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: Validating dma_mmap_coherent() parameters before calling (was Re: WARNING in memtype_reserve)

On Thu, May 14, 2020 at 08:31:58AM +0200, Christoph Hellwig wrote:
> On Thu, May 14, 2020 at 08:27:50AM +0200, Greg KH wrote:
> > On Thu, May 14, 2020 at 08:14:17AM +0200, Christoph Hellwig wrote:
> > > Guys, can you please start formal thread on this? I have no
> > > idea where this came from and what the rationale is. Btw, if the
> > > pfn is crap in dma_direct_mmap then the dma_addr_t passed in is
> > > crap, as it is derived from that. What is the caller, and how is
> > > this triggered?
> >
> >
> > Ok, to summarize, commit 2bef9aed6f0e ("usb: usbfs: correct kernel->user
> > page attribute mismatch") changed a call from remap_pfn_range() to
> > dma_mmap_coherent() for usb data buffers being sent from userspace.
>
> I only need to look at the commit for 3 seconds to tell you that it is
> completely buggy. While using dma_mmap_coherent is fundamentally the
> right thing and absolutely required for dma_alloc_* allocations, USB
> also uses it's own local gen pool allocator or plain kmalloc for not
> DMA capable controller. This need to use remap_pfn_range. I'm pretty
> sure you hit one of those cases.
>
> The logic should be something like:
>
> if (hcd->localmem_pool || !hcd_uses_dma(hcd))
> remap_pfn_range()
> else
> dma_mmap_coherent()

Ok, that's simple enough, patch is below.

Jeremy, any objection to this change?

thanks,

greg k-h


diff --git a/drivers/usb/core/devio.c b/drivers/usb/core/devio.c
index b9db9812d6c5..d93d94d7ff50 100644
--- a/drivers/usb/core/devio.c
+++ b/drivers/usb/core/devio.c
@@ -251,9 +251,19 @@ static int usbdev_mmap(struct file *file, struct vm_area_struct *vma)
usbm->vma_use_count = 1;
INIT_LIST_HEAD(&usbm->memlist);

- if (dma_mmap_coherent(hcd->self.sysdev, vma, mem, dma_handle, size)) {
- dec_usb_memory_use_count(usbm, &usbm->vma_use_count);
- return -EAGAIN;
+ if (hcd->localmem_pool || !hcd_uses_dma(hcd)) {
+ if (remap_pfn_range(vma, vma->vm_start,
+ virt_to_phys(usbm->mem) >> PAGE_SHIFT,
+ size, vma->vm_page_prot) < 0) {
+ dec_usb_memory_use_count(usbm, &usbm->vma_use_count);
+ return -EAGAIN;
+ }
+ } else {
+ if (dma_mmap_coherent(hcd->self.sysdev, vma, mem, dma_handle,
+ size)) {
+ dec_usb_memory_use_count(usbm, &usbm->vma_use_count);
+ return -EAGAIN;
+ }
}

vma->vm_flags |= VM_IO;

2020-05-14 09:13:14

by syzbot

[permalink] [raw]
Subject: Re: WARNING in memtype_reserve

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger crash:

Reported-and-tested-by: [email protected]

Tested on:

commit: d5eeab8d Merge tag 'scsi-fixes' of git://git.kernel.org/pu..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel config: https://syzkaller.appspot.com/x/.config?x=b0212dbee046bc1f
dashboard link: https://syzkaller.appspot.com/bug?extid=353be47c9ce21b68b7ed
compiler: gcc (GCC) 9.0.0 20181231 (experimental)
patch: https://syzkaller.appspot.com/x/patch.diff?x=1046a052100000

Note: testing is done by a robot and is best-effort only.

2020-05-14 10:32:23

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: WARNING in memtype_reserve

On Sat, May 09, 2020 at 12:20:14AM -0700, syzbot wrote:
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit: d5eeab8d Merge tag 'scsi-fixes' of git://git.kernel.org/pu..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=15093632100000
> kernel config: https://syzkaller.appspot.com/x/.config?x=b0212dbee046bc1f
> dashboard link: https://syzkaller.appspot.com/bug?extid=353be47c9ce21b68b7ed
> compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> userspace arch: i386
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=168ee02c100000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=119f3788100000
>
> The bug was bisected to:
>
> commit 2bef9aed6f0e22391c8d4570749b1acc9bc3981e
> Author: Jeremy Linton <[email protected]>
> Date: Mon May 4 20:13:48 2020 +0000
>
> usb: usbfs: correct kernel->user page attribute mismatch
>
> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=1701f168100000
> final crash: https://syzkaller.appspot.com/x/report.txt?x=1481f168100000
> console output: https://syzkaller.appspot.com/x/log.txt?x=1081f168100000
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: [email protected]
> Fixes: 2bef9aed6f0e ("usb: usbfs: correct kernel->user page attribute mismatch")
>
> ------------[ cut here ]------------
> memtype_reserve failed: [mem 0xffffffffff000-0x00008fff], req write-back
> WARNING: CPU: 1 PID: 7025 at arch/x86/mm/pat/memtype.c:589 memtype_reserve+0x69f/0x820 arch/x86/mm/pat/memtype.c:589
> Kernel panic - not syncing: panic_on_warn set ...
> CPU: 1 PID: 7025 Comm: syz-executor254 Not tainted 5.7.0-rc4-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Call Trace:
> __dump_stack lib/dump_stack.c:77 [inline]
> dump_stack+0x188/0x20d lib/dump_stack.c:118
> panic+0x2e3/0x75c kernel/panic.c:221
> __warn.cold+0x2f/0x35 kernel/panic.c:582
> report_bug+0x27b/0x2f0 lib/bug.c:195
> fixup_bug arch/x86/kernel/traps.c:175 [inline]
> fixup_bug arch/x86/kernel/traps.c:170 [inline]
> do_error_trap+0x12b/0x220 arch/x86/kernel/traps.c:267
> do_invalid_op+0x32/0x40 arch/x86/kernel/traps.c:286
> invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1027
> RIP: 0010:memtype_reserve+0x69f/0x820 arch/x86/mm/pat/memtype.c:589
> Code: 48 8b 2c ed c0 00 29 88 e8 ae ad 3e 00 48 8d 4b ff 49 89 e8 4c 89 e2 48 c7 c6 20 01 29 88 48 c7 c7 80 f9 28 88 e8 79 e8 0f 00 <0f> 0b 41 bf ea ff ff ff e9 03 fc ff ff 41 bf ea ff ff ff e9 f8 fb
> RSP: 0018:ffffc900015e7790 EFLAGS: 00010282
> RAX: 0000000000000000 RBX: 0000000000009000 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: ffffffff815ce181 RDI: fffff520002bcee4
> RBP: ffffffff8828ff40 R08: ffff888097ce85c0 R09: ffffed1015ce45f1
> R10: ffff8880ae722f83 R11: ffffed1015ce45f0 R12: 000ffffffffff000
> R13: 1ffff920002bcef8 R14: dffffc0000000000 R15: 0000000000000000
> reserve_pfn_range+0x173/0x470 arch/x86/mm/pat/memtype.c:941
> track_pfn_remap+0x18b/0x280 arch/x86/mm/pat/memtype.c:1033
> remap_pfn_range+0x202/0xbf0 mm/memory.c:2130
> dma_direct_mmap+0x197/0x260 kernel/dma/direct.c:453
> dma_mmap_attrs+0xfe/0x150 kernel/dma/mapping.c:237
> usbdev_mmap+0x3ae/0x730 drivers/usb/core/devio.c:254
> call_mmap include/linux/fs.h:1912 [inline]
> mmap_region+0xafb/0x1540 mm/mmap.c:1772
> do_mmap+0x849/0x1160 mm/mmap.c:1545
> do_mmap_pgoff include/linux/mm.h:2553 [inline]
> vm_mmap_pgoff+0x197/0x200 mm/util.c:506
> ksys_mmap_pgoff+0x457/0x5b0 mm/mmap.c:1595
> do_syscall_32_irqs_on arch/x86/entry/common.c:337 [inline]
> do_fast_syscall_32+0x270/0xe90 arch/x86/entry/common.c:396
> entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
> Kernel Offset: disabled
> Rebooting in 86400 seconds..

#syz test: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git d5eeab8d

diff --git a/drivers/usb/core/devio.c b/drivers/usb/core/devio.c
index b9db9812d6c5..d93d94d7ff50 100644
--- a/drivers/usb/core/devio.c
+++ b/drivers/usb/core/devio.c
@@ -251,9 +251,19 @@ static int usbdev_mmap(struct file *file, struct vm_area_struct *vma)
usbm->vma_use_count = 1;
INIT_LIST_HEAD(&usbm->memlist);

- if (dma_mmap_coherent(hcd->self.sysdev, vma, mem, dma_handle, size)) {
- dec_usb_memory_use_count(usbm, &usbm->vma_use_count);
- return -EAGAIN;
+ if (hcd->localmem_pool || !hcd_uses_dma(hcd)) {
+ if (remap_pfn_range(vma, vma->vm_start,
+ virt_to_phys(usbm->mem) >> PAGE_SHIFT,
+ size, vma->vm_page_prot) < 0) {
+ dec_usb_memory_use_count(usbm, &usbm->vma_use_count);
+ return -EAGAIN;
+ }
+ } else {
+ if (dma_mmap_coherent(hcd->self.sysdev, vma, mem, dma_handle,
+ size)) {
+ dec_usb_memory_use_count(usbm, &usbm->vma_use_count);
+ return -EAGAIN;
+ }
}

vma->vm_flags |= VM_IO;

2020-05-14 10:46:26

by syzbot

[permalink] [raw]
Subject: Re: WARNING in memtype_reserve

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger crash:

Reported-and-tested-by: [email protected]

Tested on:

commit: d5eeab8d Merge tag 'scsi-fixes' of git://git.kernel.org/pu..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel config: https://syzkaller.appspot.com/x/.config?x=b0212dbee046bc1f
dashboard link: https://syzkaller.appspot.com/bug?extid=353be47c9ce21b68b7ed
compiler: gcc (GCC) 9.0.0 20181231 (experimental)
patch: https://syzkaller.appspot.com/x/patch.diff?x=12bd6806100000

Note: testing is done by a robot and is best-effort only.

2020-05-14 11:12:11

by Jeremy Linton

[permalink] [raw]
Subject: Re: Validating dma_mmap_coherent() parameters before calling (was Re: WARNING in memtype_reserve)

Hi,

On 5/14/20 1:31 AM, Christoph Hellwig wrote:
> On Thu, May 14, 2020 at 08:27:50AM +0200, Greg KH wrote:
>> On Thu, May 14, 2020 at 08:14:17AM +0200, Christoph Hellwig wrote:
>>> Guys, can you please start formal thread on this? I have no
>>> idea where this came from and what the rationale is. Btw, if the
>>> pfn is crap in dma_direct_mmap then the dma_addr_t passed in is
>>> crap, as it is derived from that. What is the caller, and how is
>>> this triggered?
>>
>>
>> Ok, to summarize, commit 2bef9aed6f0e ("usb: usbfs: correct kernel->user
>> page attribute mismatch") changed a call from remap_pfn_range() to
>> dma_mmap_coherent() for usb data buffers being sent from userspace.
>
> I only need to look at the commit for 3 seconds to tell you that it is
> completely buggy. While using dma_mmap_coherent is fundamentally the
> right thing and absolutely required for dma_alloc_* allocations, USB
> also uses it's own local gen pool allocator or plain kmalloc for not
> DMA capable controller. This need to use remap_pfn_range. I'm pretty
> sure you hit one of those cases.

? The code path in question is usbdev_mmap() and the allocation is done
~13 lines lines before as a usb_alloc_coherent().


>
> The logic should be something like:
>
> if (hcd->localmem_pool || !hcd_uses_dma(hcd))
> remap_pfn_range()
> else
> dma_mmap_coherent()
>

That sort of makes sense, except for the above, and the fact that I
would imagine the dma_mmap_coherent should be dealing with that case.
I'm not really clear about the details of the GCE usb device here, but
my first guess at this was the dma_pgprot() in dma_direct_mmap() is
incorrectly picking a pgprot...

2020-05-14 11:16:38

by Christoph Hellwig

[permalink] [raw]
Subject: Re: Validating dma_mmap_coherent() parameters before calling (was Re: WARNING in memtype_reserve)

On Thu, May 14, 2020 at 06:10:03AM -0500, Jeremy Linton wrote:
>> I only need to look at the commit for 3 seconds to tell you that it is
>> completely buggy. While using dma_mmap_coherent is fundamentally the
>> right thing and absolutely required for dma_alloc_* allocations, USB
>> also uses it's own local gen pool allocator or plain kmalloc for not
>> DMA capable controller. This need to use remap_pfn_range. I'm pretty
>> sure you hit one of those cases.
>
> ? The code path in question is usbdev_mmap() and the allocation is done ~13
> lines lines before as a usb_alloc_coherent().

And did you take a look at how usb_alloc_coherent is implemented? That
should make it completely obvious that not all allocations come
from dma_alloc_*.

> That sort of makes sense, except for the above, and the fact that I would
> imagine the dma_mmap_coherent should be dealing with that case. I'm not
> really clear about the details of the GCE usb device here, but my first
> guess at this was the dma_pgprot() in dma_direct_mmap() is incorrectly
> picking a pgprot...

No, dma_mmap_* / dma_direct_mmap has absolutely no business dealing
with memory that did not come from the DMA allocator.

2020-05-14 11:18:42

by Jeremy Linton

[permalink] [raw]
Subject: Re: Validating dma_mmap_coherent() parameters before calling (was Re: WARNING in memtype_reserve)

On 5/14/20 6:14 AM, Christoph Hellwig wrote:
> On Thu, May 14, 2020 at 06:10:03AM -0500, Jeremy Linton wrote:
>>> I only need to look at the commit for 3 seconds to tell you that it is
>>> completely buggy. While using dma_mmap_coherent is fundamentally the
>>> right thing and absolutely required for dma_alloc_* allocations, USB
>>> also uses it's own local gen pool allocator or plain kmalloc for not
>>> DMA capable controller. This need to use remap_pfn_range. I'm pretty
>>> sure you hit one of those cases.
>>
>> ? The code path in question is usbdev_mmap() and the allocation is done ~13
>> lines lines before as a usb_alloc_coherent().
>
> And did you take a look at how usb_alloc_coherent is implemented? That
> should make it completely obvious that not all allocations come
> from dma_alloc_*.

No, your right, I noticed/remembered the usb_alloc vs dma_alloc
difference right after sending that email, and was just about to say so.

Sorry, you right.

>
>> That sort of makes sense, except for the above, and the fact that I would
>> imagine the dma_mmap_coherent should be dealing with that case. I'm not
>> really clear about the details of the GCE usb device here, but my first
>> guess at this was the dma_pgprot() in dma_direct_mmap() is incorrectly
>> picking a pgprot...
>
> No, dma_mmap_* / dma_direct_mmap has absolutely no business dealing
> with memory that did not come from the DMA allocator.
>

2020-05-14 11:21:41

by Jeremy Linton

[permalink] [raw]
Subject: Re: Validating dma_mmap_coherent() parameters before calling (was Re: WARNING in memtype_reserve)

On 5/14/20 2:46 AM, Greg KH wrote:
> On Thu, May 14, 2020 at 08:31:58AM +0200, Christoph Hellwig wrote:
>> On Thu, May 14, 2020 at 08:27:50AM +0200, Greg KH wrote:
>>> On Thu, May 14, 2020 at 08:14:17AM +0200, Christoph Hellwig wrote:
>>>> Guys, can you please start formal thread on this? I have no
>>>> idea where this came from and what the rationale is. Btw, if the
>>>> pfn is crap in dma_direct_mmap then the dma_addr_t passed in is
>>>> crap, as it is derived from that. What is the caller, and how is
>>>> this triggered?
>>>
>>>
>>> Ok, to summarize, commit 2bef9aed6f0e ("usb: usbfs: correct kernel->user
>>> page attribute mismatch") changed a call from remap_pfn_range() to
>>> dma_mmap_coherent() for usb data buffers being sent from userspace.
>>
>> I only need to look at the commit for 3 seconds to tell you that it is
>> completely buggy. While using dma_mmap_coherent is fundamentally the
>> right thing and absolutely required for dma_alloc_* allocations, USB
>> also uses it's own local gen pool allocator or plain kmalloc for not
>> DMA capable controller. This need to use remap_pfn_range. I'm pretty
>> sure you hit one of those cases.
>>
>> The logic should be something like:
>>
>> if (hcd->localmem_pool || !hcd_uses_dma(hcd))
>> remap_pfn_range()
>> else
>> dma_mmap_coherent()
>
> Ok, that's simple enough, patch is below.
>
> Jeremy, any objection to this change?

No, thats fine but since I just translated usb_alloc_coherent() to
dma_map_coherent in my not fully away head. Putting this as
"usb_map_cohernet()" sort of makes more sense.


>
> thanks,
>
> greg k-h
>
>
> diff --git a/drivers/usb/core/devio.c b/drivers/usb/core/devio.c
> index b9db9812d6c5..d93d94d7ff50 100644
> --- a/drivers/usb/core/devio.c
> +++ b/drivers/usb/core/devio.c
> @@ -251,9 +251,19 @@ static int usbdev_mmap(struct file *file, struct vm_area_struct *vma)
> usbm->vma_use_count = 1;
> INIT_LIST_HEAD(&usbm->memlist);
>
> - if (dma_mmap_coherent(hcd->self.sysdev, vma, mem, dma_handle, size)) {
> - dec_usb_memory_use_count(usbm, &usbm->vma_use_count);
> - return -EAGAIN;
> + if (hcd->localmem_pool || !hcd_uses_dma(hcd)) {
> + if (remap_pfn_range(vma, vma->vm_start,
> + virt_to_phys(usbm->mem) >> PAGE_SHIFT,
> + size, vma->vm_page_prot) < 0) {
> + dec_usb_memory_use_count(usbm, &usbm->vma_use_count);
> + return -EAGAIN;
> + }
> + } else {
> + if (dma_mmap_coherent(hcd->self.sysdev, vma, mem, dma_handle,
> + size)) {
> + dec_usb_memory_use_count(usbm, &usbm->vma_use_count);
> + return -EAGAIN;
> + }
> }
>
> vma->vm_flags |= VM_IO;
>

2020-05-14 11:24:08

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: Validating dma_mmap_coherent() parameters before calling (was Re: WARNING in memtype_reserve)

On Thu, May 14, 2020 at 06:17:41AM -0500, Jeremy Linton wrote:
> On 5/14/20 2:46 AM, Greg KH wrote:
> > On Thu, May 14, 2020 at 08:31:58AM +0200, Christoph Hellwig wrote:
> > > On Thu, May 14, 2020 at 08:27:50AM +0200, Greg KH wrote:
> > > > On Thu, May 14, 2020 at 08:14:17AM +0200, Christoph Hellwig wrote:
> > > > > Guys, can you please start formal thread on this? I have no
> > > > > idea where this came from and what the rationale is. Btw, if the
> > > > > pfn is crap in dma_direct_mmap then the dma_addr_t passed in is
> > > > > crap, as it is derived from that. What is the caller, and how is
> > > > > this triggered?
> > > >
> > > >
> > > > Ok, to summarize, commit 2bef9aed6f0e ("usb: usbfs: correct kernel->user
> > > > page attribute mismatch") changed a call from remap_pfn_range() to
> > > > dma_mmap_coherent() for usb data buffers being sent from userspace.
> > >
> > > I only need to look at the commit for 3 seconds to tell you that it is
> > > completely buggy. While using dma_mmap_coherent is fundamentally the
> > > right thing and absolutely required for dma_alloc_* allocations, USB
> > > also uses it's own local gen pool allocator or plain kmalloc for not
> > > DMA capable controller. This need to use remap_pfn_range. I'm pretty
> > > sure you hit one of those cases.
> > >
> > > The logic should be something like:
> > >
> > > if (hcd->localmem_pool || !hcd_uses_dma(hcd))
> > > remap_pfn_range()
> > > else
> > > dma_mmap_coherent()
> >
> > Ok, that's simple enough, patch is below.
> >
> > Jeremy, any objection to this change?
>
> No, thats fine but since I just translated usb_alloc_coherent() to
> dma_map_coherent in my not fully away head. Putting this as
> "usb_map_cohernet()" sort of makes more sense.

Thanks, I'll turn this into a "real" patch now...

greg k-h