2016-12-12 20:31:11

by Andrey Konovalov

[permalink] [raw]
Subject: usb/gadget: warning in ep_write_iter/__alloc_pages_nodemask

Hi!

While running the syzkaller fuzzer I've got the following error report.

The issue is that the len argument is not checked for being too big.

WARNING: CPU: 1 PID: 9935 at mm/page_alloc.c:3511
__alloc_pages_nodemask+0x159c/0x1e20
Kernel panic - not syncing: panic_on_warn set ...

CPU: 1 PID: 9935 Comm: syz-executor0 Not tainted 4.9.0-rc7+ #34
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
ffff88006949f2c8 ffffffff81f96b8a ffffffff00000200 1ffff1000d293dec
ffffed000d293de4 0000000000000a06 0000000041b58ab3 ffffffff8598b510
ffffffff81f968f8 0000000041b58ab3 ffffffff85942a58 ffffffff81432860
Call Trace:
[< inline >] __dump_stack lib/dump_stack.c:15
[<ffffffff81f96b8a>] dump_stack+0x292/0x398 lib/dump_stack.c:51
[<ffffffff8168c88e>] panic+0x1cb/0x3a9 kernel/panic.c:179
[<ffffffff812b80b4>] __warn+0x1c4/0x1e0 kernel/panic.c:542
[<ffffffff812b831c>] warn_slowpath_null+0x2c/0x40 kernel/panic.c:585
[< inline >] __alloc_pages_slowpath mm/page_alloc.c:3511
[<ffffffff816c08ac>] __alloc_pages_nodemask+0x159c/0x1e20 mm/page_alloc.c:3781
[<ffffffff817cde17>] alloc_pages_current+0x1c7/0x6b0 mm/mempolicy.c:2072
[< inline >] alloc_pages include/linux/gfp.h:469
[<ffffffff8172fd8f>] kmalloc_order+0x1f/0x70 mm/slab_common.c:1015
[<ffffffff8172fdff>] kmalloc_order_trace+0x1f/0x160 mm/slab_common.c:1026
[< inline >] kmalloc_large include/linux/slab.h:422
[<ffffffff817e01f0>] __kmalloc+0x210/0x2d0 mm/slub.c:3723
[< inline >] kmalloc include/linux/slab.h:495
[<ffffffff832262a7>] ep_write_iter+0x167/0xb50
drivers/usb/gadget/legacy/inode.c:664
[< inline >] new_sync_write fs/read_write.c:499
[<ffffffff817fdcd3>] __vfs_write+0x483/0x760 fs/read_write.c:512
[<ffffffff817ff720>] vfs_write+0x170/0x4e0 fs/read_write.c:560
[< inline >] SYSC_write fs/read_write.c:607
[<ffffffff81803b2b>] SyS_write+0xfb/0x230 fs/read_write.c:599
[<ffffffff84f47ec1>] entry_SYSCALL_64_fastpath+0x1f/0xc2
Dumping ftrace buffer:
(ftrace buffer empty)
Kernel Offset: disabled


2016-12-12 20:32:40

by Andrey Konovalov

[permalink] [raw]
Subject: Re: usb/gadget: warning in ep_write_iter/__alloc_pages_nodemask

On Mon, Dec 12, 2016 at 9:31 PM, Andrey Konovalov <[email protected]> wrote:
> Hi!
>
> While running the syzkaller fuzzer I've got the following error report.
>
> The issue is that the len argument is not checked for being too big.
>
> WARNING: CPU: 1 PID: 9935 at mm/page_alloc.c:3511
> __alloc_pages_nodemask+0x159c/0x1e20
> Kernel panic - not syncing: panic_on_warn set ...
>
> CPU: 1 PID: 9935 Comm: syz-executor0 Not tainted 4.9.0-rc7+ #34
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> ffff88006949f2c8 ffffffff81f96b8a ffffffff00000200 1ffff1000d293dec
> ffffed000d293de4 0000000000000a06 0000000041b58ab3 ffffffff8598b510
> ffffffff81f968f8 0000000041b58ab3 ffffffff85942a58 ffffffff81432860
> Call Trace:
> [< inline >] __dump_stack lib/dump_stack.c:15
> [<ffffffff81f96b8a>] dump_stack+0x292/0x398 lib/dump_stack.c:51
> [<ffffffff8168c88e>] panic+0x1cb/0x3a9 kernel/panic.c:179
> [<ffffffff812b80b4>] __warn+0x1c4/0x1e0 kernel/panic.c:542
> [<ffffffff812b831c>] warn_slowpath_null+0x2c/0x40 kernel/panic.c:585
> [< inline >] __alloc_pages_slowpath mm/page_alloc.c:3511
> [<ffffffff816c08ac>] __alloc_pages_nodemask+0x159c/0x1e20 mm/page_alloc.c:3781
> [<ffffffff817cde17>] alloc_pages_current+0x1c7/0x6b0 mm/mempolicy.c:2072
> [< inline >] alloc_pages include/linux/gfp.h:469
> [<ffffffff8172fd8f>] kmalloc_order+0x1f/0x70 mm/slab_common.c:1015
> [<ffffffff8172fdff>] kmalloc_order_trace+0x1f/0x160 mm/slab_common.c:1026
> [< inline >] kmalloc_large include/linux/slab.h:422
> [<ffffffff817e01f0>] __kmalloc+0x210/0x2d0 mm/slub.c:3723
> [< inline >] kmalloc include/linux/slab.h:495
> [<ffffffff832262a7>] ep_write_iter+0x167/0xb50
> drivers/usb/gadget/legacy/inode.c:664
> [< inline >] new_sync_write fs/read_write.c:499
> [<ffffffff817fdcd3>] __vfs_write+0x483/0x760 fs/read_write.c:512
> [<ffffffff817ff720>] vfs_write+0x170/0x4e0 fs/read_write.c:560
> [< inline >] SYSC_write fs/read_write.c:607
> [<ffffffff81803b2b>] SyS_write+0xfb/0x230 fs/read_write.c:599
> [<ffffffff84f47ec1>] entry_SYSCALL_64_fastpath+0x1f/0xc2
> Dumping ftrace buffer:
> (ftrace buffer empty)
> Kernel Offset: disabled

+syzkaller

2016-12-12 21:00:30

by Alan Stern

[permalink] [raw]
Subject: Re: usb/gadget: warning in ep_write_iter/__alloc_pages_nodemask

On Mon, 12 Dec 2016, Andrey Konovalov wrote:

> On Mon, Dec 12, 2016 at 9:31 PM, Andrey Konovalov <[email protected]> wrote:
> > Hi!
> >
> > While running the syzkaller fuzzer I've got the following error report.
> >
> > The issue is that the len argument is not checked for being too big.
> >
> > WARNING: CPU: 1 PID: 9935 at mm/page_alloc.c:3511
> > __alloc_pages_nodemask+0x159c/0x1e20
> > Kernel panic - not syncing: panic_on_warn set ...
> >
> > CPU: 1 PID: 9935 Comm: syz-executor0 Not tainted 4.9.0-rc7+ #34
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> > ffff88006949f2c8 ffffffff81f96b8a ffffffff00000200 1ffff1000d293dec
> > ffffed000d293de4 0000000000000a06 0000000041b58ab3 ffffffff8598b510
> > ffffffff81f968f8 0000000041b58ab3 ffffffff85942a58 ffffffff81432860
> > Call Trace:
> > [< inline >] __dump_stack lib/dump_stack.c:15
> > [<ffffffff81f96b8a>] dump_stack+0x292/0x398 lib/dump_stack.c:51
> > [<ffffffff8168c88e>] panic+0x1cb/0x3a9 kernel/panic.c:179
> > [<ffffffff812b80b4>] __warn+0x1c4/0x1e0 kernel/panic.c:542
> > [<ffffffff812b831c>] warn_slowpath_null+0x2c/0x40 kernel/panic.c:585
> > [< inline >] __alloc_pages_slowpath mm/page_alloc.c:3511
> > [<ffffffff816c08ac>] __alloc_pages_nodemask+0x159c/0x1e20 mm/page_alloc.c:3781
> > [<ffffffff817cde17>] alloc_pages_current+0x1c7/0x6b0 mm/mempolicy.c:2072
> > [< inline >] alloc_pages include/linux/gfp.h:469
> > [<ffffffff8172fd8f>] kmalloc_order+0x1f/0x70 mm/slab_common.c:1015
> > [<ffffffff8172fdff>] kmalloc_order_trace+0x1f/0x160 mm/slab_common.c:1026
> > [< inline >] kmalloc_large include/linux/slab.h:422
> > [<ffffffff817e01f0>] __kmalloc+0x210/0x2d0 mm/slub.c:3723
> > [< inline >] kmalloc include/linux/slab.h:495
> > [<ffffffff832262a7>] ep_write_iter+0x167/0xb50
> > drivers/usb/gadget/legacy/inode.c:664
> > [< inline >] new_sync_write fs/read_write.c:499
> > [<ffffffff817fdcd3>] __vfs_write+0x483/0x760 fs/read_write.c:512
> > [<ffffffff817ff720>] vfs_write+0x170/0x4e0 fs/read_write.c:560
> > [< inline >] SYSC_write fs/read_write.c:607
> > [<ffffffff81803b2b>] SyS_write+0xfb/0x230 fs/read_write.c:599
> > [<ffffffff84f47ec1>] entry_SYSCALL_64_fastpath+0x1f/0xc2
> > Dumping ftrace buffer:
> > (ftrace buffer empty)
> > Kernel Offset: disabled

I'm not an expert in this area, but it seems like length checking of
I/O operations should be done in a more central location, like the
VFS, rather than in a million different drivers.

Anyway, it's not a big deal if the memory allocation fails. Users who
try to transfer large amounts of data at once should expect that
sometimes it won't work.

Alan Stern

2016-12-12 21:05:09

by Michal Hocko

[permalink] [raw]
Subject: Re: usb/gadget: warning in ep_write_iter/__alloc_pages_nodemask

On Mon 12-12-16 21:32:35, Andrey Konovalov wrote:
> On Mon, Dec 12, 2016 at 9:31 PM, Andrey Konovalov <[email protected]> wrote:
> > Hi!
> >
> > While running the syzkaller fuzzer I've got the following error report.
> >
> > The issue is that the len argument is not checked for being too big.

Well, the value is checked in kmalloc_slab. There is a discrepancy
though. While the page allocator enforces order < MAX_ORDER, slab
allocators enforce size <= KMALLOC_MAX_SIZE and

KMALLOC_MAX_SIZE is 1UL << (MAX_ORDER + PAGE_SHIFT) for SLUB which is
what I expect you have or 1UL << (MAX_ORDER + PAGE_SHIFT - 1) for SLAB
on most archs. This means that KMALLOC_MAX_SIZE might be MAX_ORDER for
SLUB which would explain your warning.

Let's CC Christoph, this is nothing really new, but I suspect that SLUB
should change the maximum size to something like SLAB does.

That being said, what ep_write_iter does sounds quite stupit. It just
allocates a large continuous buffer which seems to be under user
control... Aka no good! It should do that per pages or something like
that. Something worth fixing

> > WARNING: CPU: 1 PID: 9935 at mm/page_alloc.c:3511
> > __alloc_pages_nodemask+0x159c/0x1e20
> > Kernel panic - not syncing: panic_on_warn set ...
> >
> > CPU: 1 PID: 9935 Comm: syz-executor0 Not tainted 4.9.0-rc7+ #34
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> > ffff88006949f2c8 ffffffff81f96b8a ffffffff00000200 1ffff1000d293dec
> > ffffed000d293de4 0000000000000a06 0000000041b58ab3 ffffffff8598b510
> > ffffffff81f968f8 0000000041b58ab3 ffffffff85942a58 ffffffff81432860
> > Call Trace:
> > [< inline >] __dump_stack lib/dump_stack.c:15
> > [<ffffffff81f96b8a>] dump_stack+0x292/0x398 lib/dump_stack.c:51
> > [<ffffffff8168c88e>] panic+0x1cb/0x3a9 kernel/panic.c:179
> > [<ffffffff812b80b4>] __warn+0x1c4/0x1e0 kernel/panic.c:542
> > [<ffffffff812b831c>] warn_slowpath_null+0x2c/0x40 kernel/panic.c:585
> > [< inline >] __alloc_pages_slowpath mm/page_alloc.c:3511
> > [<ffffffff816c08ac>] __alloc_pages_nodemask+0x159c/0x1e20 mm/page_alloc.c:3781
> > [<ffffffff817cde17>] alloc_pages_current+0x1c7/0x6b0 mm/mempolicy.c:2072
> > [< inline >] alloc_pages include/linux/gfp.h:469
> > [<ffffffff8172fd8f>] kmalloc_order+0x1f/0x70 mm/slab_common.c:1015
> > [<ffffffff8172fdff>] kmalloc_order_trace+0x1f/0x160 mm/slab_common.c:1026
> > [< inline >] kmalloc_large include/linux/slab.h:422
> > [<ffffffff817e01f0>] __kmalloc+0x210/0x2d0 mm/slub.c:3723
> > [< inline >] kmalloc include/linux/slab.h:495
> > [<ffffffff832262a7>] ep_write_iter+0x167/0xb50
> > drivers/usb/gadget/legacy/inode.c:664
> > [< inline >] new_sync_write fs/read_write.c:499
> > [<ffffffff817fdcd3>] __vfs_write+0x483/0x760 fs/read_write.c:512
> > [<ffffffff817ff720>] vfs_write+0x170/0x4e0 fs/read_write.c:560
> > [< inline >] SYSC_write fs/read_write.c:607
> > [<ffffffff81803b2b>] SyS_write+0xfb/0x230 fs/read_write.c:599
> > [<ffffffff84f47ec1>] entry_SYSCALL_64_fastpath+0x1f/0xc2
> > Dumping ftrace buffer:
> > (ftrace buffer empty)
> > Kernel Offset: disabled
>
> +syzkaller

--
Michal Hocko
SUSE Labs

2016-12-12 21:12:19

by Alan Stern

[permalink] [raw]
Subject: Re: usb/gadget: warning in ep_write_iter/__alloc_pages_nodemask

On Mon, 12 Dec 2016, Michal Hocko wrote:

> On Mon 12-12-16 21:32:35, Andrey Konovalov wrote:
> > On Mon, Dec 12, 2016 at 9:31 PM, Andrey Konovalov <[email protected]> wrote:
> > > Hi!
> > >
> > > While running the syzkaller fuzzer I've got the following error report.
> > >
> > > The issue is that the len argument is not checked for being too big.
>
> Well, the value is checked in kmalloc_slab. There is a discrepancy
> though. While the page allocator enforces order < MAX_ORDER, slab
> allocators enforce size <= KMALLOC_MAX_SIZE and
>
> KMALLOC_MAX_SIZE is 1UL << (MAX_ORDER + PAGE_SHIFT) for SLUB which is
> what I expect you have or 1UL << (MAX_ORDER + PAGE_SHIFT - 1) for SLAB
> on most archs. This means that KMALLOC_MAX_SIZE might be MAX_ORDER for
> SLUB which would explain your warning.
>
> Let's CC Christoph, this is nothing really new, but I suspect that SLUB
> should change the maximum size to something like SLAB does.
>
> That being said, what ep_write_iter does sounds quite stupit. It just
> allocates a large continuous buffer which seems to be under user
> control... Aka no good! It should do that per pages or something like
> that. Something worth fixing

It's not important enough to make the driver do all this work. If
users want to send large amounts of data, they can send it a page at a
time (or something like that).

If you really want to prevent the driver from attempting to allocate a
large buffer, all that's needed is an upper limit on the total size.
For example, 64 KB.

Alan Stern

2016-12-13 08:04:31

by Michal Hocko

[permalink] [raw]
Subject: Re: usb/gadget: warning in ep_write_iter/__alloc_pages_nodemask

On Mon 12-12-16 16:12:16, Alan Stern wrote:
> On Mon, 12 Dec 2016, Michal Hocko wrote:
>
> > On Mon 12-12-16 21:32:35, Andrey Konovalov wrote:
> > > On Mon, Dec 12, 2016 at 9:31 PM, Andrey Konovalov <[email protected]> wrote:
> > > > Hi!
> > > >
> > > > While running the syzkaller fuzzer I've got the following error report.
> > > >
> > > > The issue is that the len argument is not checked for being too big.
> >
> > Well, the value is checked in kmalloc_slab. There is a discrepancy
> > though. While the page allocator enforces order < MAX_ORDER, slab
> > allocators enforce size <= KMALLOC_MAX_SIZE and
> >
> > KMALLOC_MAX_SIZE is 1UL << (MAX_ORDER + PAGE_SHIFT) for SLUB which is
> > what I expect you have or 1UL << (MAX_ORDER + PAGE_SHIFT - 1) for SLAB
> > on most archs. This means that KMALLOC_MAX_SIZE might be MAX_ORDER for
> > SLUB which would explain your warning.
> >
> > Let's CC Christoph, this is nothing really new, but I suspect that SLUB
> > should change the maximum size to something like SLAB does.
> >
> > That being said, what ep_write_iter does sounds quite stupit. It just
> > allocates a large continuous buffer which seems to be under user
> > control... Aka no good! It should do that per pages or something like
> > that. Something worth fixing
>
> It's not important enough to make the driver do all this work. If
> users want to send large amounts of data, they can send it a page at a
> time (or something like that).

Is it really necessary to allocate the full iov_iter_count? Why cannot
we process the from buffer one page at a time?

> If you really want to prevent the driver from attempting to allocate a
> large buffer, all that's needed is an upper limit on the total size.
> For example, 64 KB.

Well, my point was that it is not really hard to imagine to deplete
larger contiguous memory blocks (say PAGE_ALLOC_COSTLY_ORDER). Those are
still causing the OOM killer and chances are that a controlled flood of
these requests could completely DoS the system.
--
Michal Hocko
SUSE Labs

2016-12-13 13:33:36

by Alan Stern

[permalink] [raw]
Subject: Re: usb/gadget: warning in ep_write_iter/__alloc_pages_nodemask

On Tue, 13 Dec 2016, Michal Hocko wrote:

> > > That being said, what ep_write_iter does sounds quite stupit. It just
> > > allocates a large continuous buffer which seems to be under user
> > > control... Aka no good! It should do that per pages or something like
> > > that. Something worth fixing
> >
> > It's not important enough to make the driver do all this work. If
> > users want to send large amounts of data, they can send it a page at a
> > time (or something like that).
>
> Is it really necessary to allocate the full iov_iter_count? Why cannot
> we process the from buffer one page at a time?

We could (although one page is really too small -- USB 3.1 can transfer
800 KB per ms so we ought to handle at least 128 KB at a time). But
turn the argument around: If the user wants to transfer that much data,
why can't he _submit_ it one page at a time?

> > If you really want to prevent the driver from attempting to allocate a
> > large buffer, all that's needed is an upper limit on the total size.
> > For example, 64 KB.
>
> Well, my point was that it is not really hard to imagine to deplete
> larger contiguous memory blocks (say PAGE_ALLOC_COSTLY_ORDER). Those are
> still causing the OOM killer and chances are that a controlled flood of
> these requests could completely DoS the system.

Putting a limit on the total size of a single transfer would prevent
this.

Alan Stern

2016-12-14 09:10:12

by Michal Hocko

[permalink] [raw]
Subject: Re: usb/gadget: warning in ep_write_iter/__alloc_pages_nodemask

On Tue 13-12-16 08:33:34, Alan Stern wrote:
> On Tue, 13 Dec 2016, Michal Hocko wrote:
>
> > > > That being said, what ep_write_iter does sounds quite stupit. It just
> > > > allocates a large continuous buffer which seems to be under user
> > > > control... Aka no good! It should do that per pages or something like
> > > > that. Something worth fixing
> > >
> > > It's not important enough to make the driver do all this work. If
> > > users want to send large amounts of data, they can send it a page at a
> > > time (or something like that).
> >
> > Is it really necessary to allocate the full iov_iter_count? Why cannot
> > we process the from buffer one page at a time?
>
> We could (although one page is really too small -- USB 3.1 can transfer
> 800 KB per ms so we ought to handle at least 128 KB at a time).

Is there any problem to submit larger transfers without having the
buffer physically contiguous?

> But
> turn the argument around: If the user wants to transfer that much data,
> why can't he _submit_ it one page at a time?

Not sure I understand.

> > > If you really want to prevent the driver from attempting to allocate a
> > > large buffer, all that's needed is an upper limit on the total size.
> > > For example, 64 KB.
> >
> > Well, my point was that it is not really hard to imagine to deplete
> > larger contiguous memory blocks (say PAGE_ALLOC_COSTLY_ORDER). Those are
> > still causing the OOM killer and chances are that a controlled flood of
> > these requests could completely DoS the system.
>
> Putting a limit on the total size of a single transfer would prevent
> this.

Dunno, putting a limit to the user visible interface sounds wrong to me.
--
Michal Hocko
SUSE Labs

2016-12-14 16:18:29

by Michal Hocko

[permalink] [raw]
Subject: Re: usb/gadget: warning in ep_write_iter/__alloc_pages_nodemask

On Wed 14-12-16 11:13:11, Alan Stern wrote:
> On Wed, 14 Dec 2016, Michal Hocko wrote:
>
> > On Tue 13-12-16 08:33:34, Alan Stern wrote:
> > > On Tue, 13 Dec 2016, Michal Hocko wrote:
[...]
> > > > Well, my point was that it is not really hard to imagine to deplete
> > > > larger contiguous memory blocks (say PAGE_ALLOC_COSTLY_ORDER). Those are
> > > > still causing the OOM killer and chances are that a controlled flood of
> > > > these requests could completely DoS the system.
> > >
> > > Putting a limit on the total size of a single transfer would prevent
> > > this.
> >
> > Dunno, putting a limit to the user visible interface sounds wrong to me.
>
> In practice, I think the data transfer sizes tend to be not very large.
> But I could be wrong about that.

That is one part the other is whether a malicious user can abuse this to
DoS the kernel which is the point I am trying to make here. Depleting
non-costly high orders can be quite dangerious so allowing a free ticket
to them to arbitrary user in an arbitrary amount is definitely not good.
--
Michal Hocko
SUSE Labs

2016-12-14 16:41:20

by Alan Stern

[permalink] [raw]
Subject: Re: usb/gadget: warning in ep_write_iter/__alloc_pages_nodemask

On Wed, 14 Dec 2016, Michal Hocko wrote:

> On Tue 13-12-16 08:33:34, Alan Stern wrote:
> > On Tue, 13 Dec 2016, Michal Hocko wrote:
> >
> > > > > That being said, what ep_write_iter does sounds quite stupit. It just
> > > > > allocates a large continuous buffer which seems to be under user
> > > > > control... Aka no good! It should do that per pages or something like
> > > > > that. Something worth fixing
> > > >
> > > > It's not important enough to make the driver do all this work. If
> > > > users want to send large amounts of data, they can send it a page at a
> > > > time (or something like that).
> > >
> > > Is it really necessary to allocate the full iov_iter_count? Why cannot
> > > we process the from buffer one page at a time?
> >
> > We could (although one page is really too small -- USB 3.1 can transfer
> > 800 KB per ms so we ought to handle at least 128 KB at a time).
>
> Is there any problem to submit larger transfers without having the
> buffer physically contiguous?

Async I/O would be rather awkward; it would have to use a work queue
routine. But it could be done.

And we would still end up allocating the same total space (more
actually, because we would need to store the scatter-gather table too).
It just wouldn't be contiguous.

> > But
> > turn the argument around: If the user wants to transfer that much data,
> > why can't he _submit_ it one page at a time?
>
> Not sure I understand.
>
> > > > If you really want to prevent the driver from attempting to allocate a
> > > > large buffer, all that's needed is an upper limit on the total size.
> > > > For example, 64 KB.
> > >
> > > Well, my point was that it is not really hard to imagine to deplete
> > > larger contiguous memory blocks (say PAGE_ALLOC_COSTLY_ORDER). Those are
> > > still causing the OOM killer and chances are that a controlled flood of
> > > these requests could completely DoS the system.
> >
> > Putting a limit on the total size of a single transfer would prevent
> > this.
>
> Dunno, putting a limit to the user visible interface sounds wrong to me.

In practice, I think the data transfer sizes tend to be not very large.
But I could be wrong about that.

Alan Stern