2020-11-23 21:03:19

by Norbert Slusarek

[permalink] [raw]
Subject: [PATCH RESEND] misc/vmw_vmci: bail out earlier on too big queue allocation

From: Norbert Slusarek <[email protected]>
Date: Mon, 23 Nov 2020 21:53:41 +0100
Subject: [PATCH RESEND] misc/vmw_vmci: bail out earlier on too big queue allocation

For the allocation of a queue pair in qp_host_alloc_queue() an arbitrary value
can be passed for produce_size, which can lead to miscalculation of memory we'd
like to allocate in one take. A warning is triggered at __alloc_pages_nodemask()
in mm/page_alloc.c:4737 which aborts the false allocation, but it results in a
VMware machine freezing for an infinite time.

Signed-off-by: Norbert Slusarek <[email protected]>
---

Resend because of email and formatting issues.

To provide an accurate explanation of the problem, I'll describe my observations
and include a PoC you can run yourself. The value for produce_size (0x7ffe7001)
wasn't chosen randomly, it's the least value which can trigger the warning. With
this value, calculations are done in qp_host_alloc_queue() before
calling kzalloc() with queue_size + queue_page_size. The calculation of
queue_size involves taking the size of *queue->kernel_if, which on 5.10-rc4 has
the size of 72 bytes and on 5.4.79 it's 168 bytes. While on 5.10-rc4 the size
argument for kzalloc() won't surpass the maximum value of 4096*1024 for an
individual allocation (for produce_size = 0x7ffe7001 -> kzalloc(4194216)), it
will be greater on the stable 5.4.79 kernel (for produce_size = 0x7ffe7001 ->
kzalloc(4194312)). This will ultimately lead to a warning on the stable 5.4
kernel, but not on the upstream kernel, so ideally my patch would be backported
to stable kernels. Eventhough the warning in __alloc_pages_nodemask() already
aborts the oversized allocation of memory, VMware will hang for an infinite
time, hence I wanted to provide this simple patch. We shouldn't rely on the page
allocator to abort it anyways, it's better to keep it clean and check for a too
large allocation before calling kzalloc(). When I ran the PoC in QEMU and on a
host machine, I didn't experience any freezes at all, but the warning gets
triggered.

PoC (run on 5.4 stable kernel and VMCI driver loaded for /dev/vmci):

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <endian.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>

#define VMADDR_CID_LOCAL 1
#define IOCTL_VMCI_VERSION2 1959
#define IOCTL_VMCI_INIT_CONTEXT 1952
#define IOCTL_VMCI_QUEUEPAIR_ALLOC 1960
#define VMCI_VERSION_PREHOSTQP 0x80000
#define VMCI_NO_PRIVILEGE_FLAGS 0

struct vmci_handle {
unsigned int context;
unsigned int resource;
};

struct vmci_qp_alloc_info {
struct vmci_handle handle;
unsigned int peer;
unsigned int flags;
unsigned long produce_size;
unsigned long consume_size;
unsigned long ppn_va;
unsigned long num_ppns;
int result;
unsigned int version;
};

struct vmci_init_blk {
int cid;
int flags;
};

int main(void)
{
int fd, flag;

fd = syscall(__NR_openat, -100, "/dev/vmci", O_RDWR, 0);

flag = VMCI_VERSION_PREHOSTQP;
syscall(__NR_ioctl, fd, IOCTL_VMCI_VERSION2, &flag);

struct vmci_init_blk cxt = {
.cid = VMADDR_CID_LOCAL,
.flags = VMCI_NO_PRIVILEGE_FLAGS
};
syscall(__NR_ioctl, fd, IOCTL_VMCI_INIT_CONTEXT, &cxt);

struct vmci_qp_alloc_info qp = {
.handle.context = VMADDR_CID_LOCAL,
.handle.resource = 0,
.peer = 0,
.flags = 0,
.produce_size = 0x7ffe7001,
.consume_size = 0,
.ppn_va = 0,
.num_ppns = 0,
.result = -1,
.version = 0
};
syscall(__NR_ioctl, fd, IOCTL_VMCI_QUEUEPAIR_ALLOC, &qp);

return 0;
}

---
drivers/misc/vmw_vmci/vmci_queue_pair.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/drivers/misc/vmw_vmci/vmci_queue_pair.c b/drivers/misc/vmw_vmci/vmci_queue_pair.c
index c49065887e8f..997ff32475b2 100644
--- a/drivers/misc/vmw_vmci/vmci_queue_pair.c
+++ b/drivers/misc/vmw_vmci/vmci_queue_pair.c
@@ -526,6 +526,7 @@ static struct vmci_queue *qp_host_alloc_queue(u64 size)
struct vmci_queue *queue;
size_t queue_page_size;
u64 num_pages;
+ unsigned int order;
const size_t queue_size = sizeof(*queue) + sizeof(*(queue->kernel_if));

if (size > SIZE_MAX - PAGE_SIZE)
@@ -537,6 +538,10 @@ static struct vmci_queue *qp_host_alloc_queue(u64 size)

queue_page_size = num_pages * sizeof(*queue->kernel_if->u.h.page);

+ order = get_order(queue_size + queue_page_size);
+ if (order >= MAX_ORDER)
+ return NULL;
+
queue = kzalloc(queue_size + queue_page_size, GFP_KERNEL);
if (queue) {
queue->q_header = NULL;
--
2.29.2


2020-12-01 20:25:26

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH RESEND] misc/vmw_vmci: bail out earlier on too big queue allocation

On Mon, Nov 23, 2020 at 10:01 PM Norbert Slusarek <[email protected]> wrote:
>
> From: Norbert Slusarek <[email protected]>
> Date: Mon, 23 Nov 2020 21:53:41 +0100
> Subject: [PATCH RESEND] misc/vmw_vmci: bail out earlier on too big queue allocation
>
> For the allocation of a queue pair in qp_host_alloc_queue() an arbitrary value
> can be passed for produce_size, which can lead to miscalculation of memory we'd
> like to allocate in one take. A warning is triggered at __alloc_pages_nodemask()
> in mm/page_alloc.c:4737 which aborts the false allocation, but it results in a
> VMware machine freezing for an infinite time.
>
> Signed-off-by: Norbert Slusarek <[email protected]>

Thank you for sending a patch with such a detailed analysis and even
including a test case!

> diff --git a/drivers/misc/vmw_vmci/vmci_queue_pair.c b/drivers/misc/vmw_vmci/vmci_queue_pair.c
> index c49065887e8f..997ff32475b2 100644
> --- a/drivers/misc/vmw_vmci/vmci_queue_pair.c
> +++ b/drivers/misc/vmw_vmci/vmci_queue_pair.c
> @@ -526,6 +526,7 @@ static struct vmci_queue *qp_host_alloc_queue(u64 size)
> struct vmci_queue *queue;
> size_t queue_page_size;
> u64 num_pages;
> + unsigned int order;
> const size_t queue_size = sizeof(*queue) + sizeof(*(queue->kernel_if));
>
> if (size > SIZE_MAX - PAGE_SIZE)
> @@ -537,6 +538,10 @@ static struct vmci_queue *qp_host_alloc_queue(u64 size)
>
> queue_page_size = num_pages * sizeof(*queue->kernel_if->u.h.page);
>
> + order = get_order(queue_size + queue_page_size);
> + if (order >= MAX_ORDER)
> + return NULL;
> +
> queue = kzalloc(queue_size + queue_page_size, GFP_KERNEL);

Calling kzalloc() with that user-provided size may still not be a great
idea, and MAX_ORDER is probably more than anyone ever needs here
(I don't know what the interface is needed for, but usually it is).

If there is a reasonable upper bound smaller than MAX_ORDER, I
would suggest using that. It might also be helpful to use kvzalloc()/kvfree()
in this case, which can return an arbitrary number of pages
and suffers less from fragmentation.

Note that both kzalloc() and kvzalloc() will fail for much smaller
sizes if the kernel is low on memory, so that might have the same
problem.

Arnd

2020-12-02 10:11:22

by Stefano Garzarella

[permalink] [raw]
Subject: Re: [PATCH RESEND] misc/vmw_vmci: bail out earlier on too big queue allocation

On Tue, Dec 1, 2020 at 9:21 PM Arnd Bergmann <[email protected]> wrote:
>
> On Mon, Nov 23, 2020 at 10:01 PM Norbert Slusarek <[email protected]> wrote:
> >
> > From: Norbert Slusarek <[email protected]>
> > Date: Mon, 23 Nov 2020 21:53:41 +0100
> > Subject: [PATCH RESEND] misc/vmw_vmci: bail out earlier on too big queue allocation
> >
> > For the allocation of a queue pair in qp_host_alloc_queue() an arbitrary value
> > can be passed for produce_size, which can lead to miscalculation of memory we'd
> > like to allocate in one take. A warning is triggered at __alloc_pages_nodemask()
> > in mm/page_alloc.c:4737 which aborts the false allocation, but it results in a
> > VMware machine freezing for an infinite time.
> >
> > Signed-off-by: Norbert Slusarek <[email protected]>
>
> Thank you for sending a patch with such a detailed analysis and even
> including a test case!

Yeah agree, very good details!

>
> > diff --git a/drivers/misc/vmw_vmci/vmci_queue_pair.c b/drivers/misc/vmw_vmci/vmci_queue_pair.c
> > index c49065887e8f..997ff32475b2 100644
> > --- a/drivers/misc/vmw_vmci/vmci_queue_pair.c
> > +++ b/drivers/misc/vmw_vmci/vmci_queue_pair.c
> > @@ -526,6 +526,7 @@ static struct vmci_queue *qp_host_alloc_queue(u64 size)
> > struct vmci_queue *queue;
> > size_t queue_page_size;
> > u64 num_pages;
> > + unsigned int order;
> > const size_t queue_size = sizeof(*queue) + sizeof(*(queue->kernel_if));
> >
> > if (size > SIZE_MAX - PAGE_SIZE)
> > @@ -537,6 +538,10 @@ static struct vmci_queue *qp_host_alloc_queue(u64 size)
> >
> > queue_page_size = num_pages * sizeof(*queue->kernel_if->u.h.page);
> >
> > + order = get_order(queue_size + queue_page_size);
> > + if (order >= MAX_ORDER)
> > + return NULL;
> > +
> > queue = kzalloc(queue_size + queue_page_size, GFP_KERNEL);
>
> Calling kzalloc() with that user-provided size may still not be a great
> idea, and MAX_ORDER is probably more than anyone ever needs here
> (I don't know what the interface is needed for, but usually it is).
>
> If there is a reasonable upper bound smaller than MAX_ORDER, I
> would suggest using that. It might also be helpful to use kvzalloc()/kvfree()
> in this case, which can return an arbitrary number of pages
> and suffers less from fragmentation.

I don't know well vmw_vmci, but I took a brief look, and I saw that
there is a macro (VMCI_MAX_GUEST_QP_MEMORY) used in vmci_qpair_alloc(),
I'm not sure if we can use the same macro, but maybe something similar.

Honestly I don't know where that limit comes from, maybe it was chosen
as an arbitrary and reasonable value but I suggest to check if we can
reuse the same macro here with some adjustments.
I think in some way that limit is related to the max memory that the
host should allocate.

@Jorgen any thought?

Thanks,
Stefano

>
> Note that both kzalloc() and kvzalloc() will fail for much smaller
> sizes if the kernel is low on memory, so that might have the same
> problem.
>
> Arnd
>

2020-12-07 14:43:31

by Jorgen Hansen

[permalink] [raw]
Subject: RE: [PATCH RESEND] misc/vmw_vmci: bail out earlier on too big queue allocation

From: Stefano Garzarella <[email protected]>
Sent: Wednesday, December 2, 2020 2:06 AM
>
> On Tue, Dec 1, 2020 at 9:21 PM Arnd Bergmann <[email protected]> wrote:
> >
> > On Mon, Nov 23, 2020 at 10:01 PM Norbert Slusarek <[email protected]>
> wrote:
> > >
> > > From: Norbert Slusarek <[email protected]>
> > > Date: Mon, 23 Nov 2020 21:53:41 +0100
> > > Subject: [PATCH RESEND] misc/vmw_vmci: bail out earlier on too big
> > > queue allocation
> > >
> > > For the allocation of a queue pair in qp_host_alloc_queue() an
> > > arbitrary value can be passed for produce_size, which can lead to
> > > miscalculation of memory we'd like to allocate in one take. A
> > > warning is triggered at __alloc_pages_nodemask() in
> > > mm/page_alloc.c:4737 which aborts the false allocation, but it results in a
> VMware machine freezing for an infinite time.
> > >
> > > Signed-off-by: Norbert Slusarek <[email protected]>
> >
> > Thank you for sending a patch with such a detailed analysis and even
> > including a test case!
>
> Yeah agree, very good details!

Yes, thanks a lot for pointing this out.

> >
> > > diff --git a/drivers/misc/vmw_vmci/vmci_queue_pair.c
> > > b/drivers/misc/vmw_vmci/vmci_queue_pair.c
> > > index c49065887e8f..997ff32475b2 100644
> > > --- a/drivers/misc/vmw_vmci/vmci_queue_pair.c
> > > +++ b/drivers/misc/vmw_vmci/vmci_queue_pair.c
> > > @@ -526,6 +526,7 @@ static struct vmci_queue
> *qp_host_alloc_queue(u64 size)
> > > struct vmci_queue *queue;
> > > size_t queue_page_size;
> > > u64 num_pages;
> > > + unsigned int order;
> > > const size_t queue_size = sizeof(*queue) +
> > > sizeof(*(queue->kernel_if));
> > >
> > > if (size > SIZE_MAX - PAGE_SIZE) @@ -537,6 +538,10 @@ static
> > > struct vmci_queue *qp_host_alloc_queue(u64 size)
> > >
> > > queue_page_size = num_pages *
> > > sizeof(*queue->kernel_if->u.h.page);
> > >
> > > + order = get_order(queue_size + queue_page_size);
> > > + if (order >= MAX_ORDER)
> > > + return NULL;
> > > +
> > > queue = kzalloc(queue_size + queue_page_size, GFP_KERNEL);
> >
> > Calling kzalloc() with that user-provided size may still not be a
> > great idea, and MAX_ORDER is probably more than anyone ever needs
> here
> > (I don't know what the interface is needed for, but usually it is).
> >
> > If there is a reasonable upper bound smaller than MAX_ORDER, I would
> > suggest using that. It might also be helpful to use
> > kvzalloc()/kvfree() in this case, which can return an arbitrary number
> > of pages and suffers less from fragmentation.
>
> I don't know well vmw_vmci, but I took a brief look, and I saw that there is a
> macro (VMCI_MAX_GUEST_QP_MEMORY) used in vmci_qpair_alloc(), I'm
> not sure if we can use the same macro, but maybe something similar.
>
> Honestly I don't know where that limit comes from, maybe it was chosen as
> an arbitrary and reasonable value but I suggest to check if we can reuse the
> same macro here with some adjustments.
> I think in some way that limit is related to the max memory that the host
> should allocate.
>
> @Jorgen any thought?

The VMCI_MAX_GUEST_QP_MEMORY limit is the limit for the total amount of guest memory that can be used for VMCI queue pairs with the current revision of the VMCI virtual device, so it is the upper limit on the size of a single queue pair as well. In this function, the size parameter indicates the desired queue size of one of the queues in the queue pair, so they should be bounded by this as well. The appropriate check in the beginning of the function would be

if (size > min(VMCI_MAX_GUEST_QP_MEMORY, SIZE_MAX - PAGE_SIZE))
return NULL;

Since VMCI_MAX_GUEST_QP_MEMORY is 128MB, the actual kzalloc argument should be well below the limit imposed by MAX_ORDER, with the modification of the above check.

That this unchecked value can occur here at all is due to another bug, where the queue sizes aren't validated against the VMCI_MAX_GUEST_QP_MEMORY limit on the vmci_qp_broker_alloc callpath. We'll fix that and we can update the above check as well in the same change.

Thanks,
J?rgen