2024-01-01 13:08:58

by Harshit Mogalapalli

Subject: [RFC PATCH] VMCI: Silence memcpy() run-time false positive warning

Syzkaller hit 'WARNING in dg_dispatch_as_host' bug.

memcpy: detected field-spanning write (size 56) of single field "&dg_info->msg"
at drivers/misc/vmw_vmci/vmci_datagram.c:237 (size 24)

WARNING: CPU: 0 PID: 1555 at drivers/misc/vmw_vmci/vmci_datagram.c:237
dg_dispatch_as_host+0x88e/0xa60 drivers/misc/vmw_vmci/vmci_datagram.c:237

Some code commentary, based on my understanding:

#define VMCI_DG_SIZE(_dg) (VMCI_DG_HEADERSIZE + (size_t)(_dg)->payload_size)
/* This is 24 + payload_size */

memcpy(&dg_info->msg, dg, dg_size);
Destination = &dg_info->msg --> a 24-byte structure (struct vmci_datagram)
Source = dg --> a 24-byte structure (struct vmci_datagram)
Size = dg_size = 24 + payload_size

Syzkaller managed to set payload_size to 32 (56 - 24 = 32).

struct delayed_datagram_info {
	struct datagram_entry *entry;
	struct work_struct work;
	bool in_dg_host_queue;
	/* msg and msg_payload must be together. */
	struct vmci_datagram msg;
	u8 msg_payload[];
};

The extra payload bytes are therefore copied into msg_payload[], so there
is no out-of-bounds write, but a run-time warning is seen while fuzzing
with Syzkaller.

One possible way to silence the warning is to split the memcpy() into
two parts: the first copies msg and the second copies the payload.
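
The layout argument above can be checked in userspace. The following is only a sketch under stated assumptions: struct dg_hdr and struct dg_info are hypothetical, simplified stand-ins for struct vmci_datagram and struct delayed_datagram_info (the work_struct and queue flag are omitted), and dg_make() and dg_info_copy() are illustrative helpers, not driver functions. It copies the header by assignment and the payload with a separate memcpy(), so no single write spans two members.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical 24-byte stand-in for struct vmci_datagram. */
struct dg_hdr {
	uint64_t dst;
	uint64_t src;
	uint64_t payload_size;
};

/* Simplified stand-in for struct delayed_datagram_info. */
struct dg_info {
	void *entry;
	struct dg_hdr msg;
	uint8_t msg_payload[];	/* starts right after msg */
};

/* Build a header plus payload in a single allocation. */
static struct dg_hdr *dg_make(const void *payload, size_t len)
{
	struct dg_hdr *dg = malloc(sizeof(*dg) + len);

	if (!dg)
		return NULL;
	dg->dst = 0;
	dg->src = 0;
	dg->payload_size = len;
	memcpy((uint8_t *)dg + sizeof(*dg), payload, len);
	return dg;
}

/* Copy header and payload as two separate writes, one per member. */
static struct dg_info *dg_info_copy(const struct dg_hdr *dg)
{
	struct dg_info *info;
	size_t len;

	assert(dg != NULL);
	len = (size_t)dg->payload_size;
	info = malloc(sizeof(*info) + len);
	if (!info)
		return NULL;
	info->entry = NULL;
	info->msg = *dg;	/* fixed-size header */
	memcpy(info->msg_payload, (const uint8_t *)dg + sizeof(*dg), len);
	return info;
}
```

Because msg_payload[] begins exactly where msg ends, the resulting bytes are the same as those written by the original single memcpy(); only the way the write is expressed changes.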

Reported-by: syzkaller <[email protected]>
Suggested-by: Vegard Nossum <[email protected]>
Signed-off-by: Harshit Mogalapalli <[email protected]>
---
This patch has only been tested with the C reproducer; no
driver-specific testing has been done.
---
drivers/misc/vmw_vmci/vmci_datagram.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/vmw_vmci/vmci_datagram.c b/drivers/misc/vmw_vmci/vmci_datagram.c
index f50d22882476..b43661590f56 100644
--- a/drivers/misc/vmw_vmci/vmci_datagram.c
+++ b/drivers/misc/vmw_vmci/vmci_datagram.c
@@ -216,6 +216,7 @@ static int dg_dispatch_as_host(u32 context_id, struct vmci_datagram *dg)
if (dst_entry->run_delayed ||
dg->src.context == VMCI_HOST_CONTEXT_ID) {
struct delayed_datagram_info *dg_info;
+ size_t payload_size = dg_size - VMCI_DG_HEADERSIZE;

if (atomic_add_return(1, &delayed_dg_host_queue_size)
== VMCI_MAX_DELAYED_DG_HOST_QUEUE_SIZE) {
@@ -234,7 +235,8 @@ static int dg_dispatch_as_host(u32 context_id, struct vmci_datagram *dg)

dg_info->in_dg_host_queue = true;
dg_info->entry = dst_entry;
- memcpy(&dg_info->msg, dg, dg_size);
+ memcpy(&dg_info->msg, dg, VMCI_DG_HEADERSIZE);
+ memcpy(&dg_info->msg_payload, dg + 1, payload_size);

INIT_WORK(&dg_info->work, dg_delayed_dispatch);
schedule_work(&dg_info->work);
--
2.42.0



2024-01-01 13:55:34

by Greg KH

Subject: Re: [RFC PATCH] VMCI: Silence memcpy() run-time false positive warning

On Mon, Jan 01, 2024 at 05:08:28AM -0800, Harshit Mogalapalli wrote:
> Syzkaller hit 'WARNING in dg_dispatch_as_host' bug.
>
> memcpy: detected field-spanning write (size 56) of single field "&dg_info->msg"
> at drivers/misc/vmw_vmci/vmci_datagram.c:237 (size 24)
>
> WARNING: CPU: 0 PID: 1555 at drivers/misc/vmw_vmci/vmci_datagram.c:237
> dg_dispatch_as_host+0x88e/0xa60 drivers/misc/vmw_vmci/vmci_datagram.c:237
>
> Some code commentary, based on my understanding:
>
> 544 #define VMCI_DG_SIZE(_dg) (VMCI_DG_HEADERSIZE + (size_t)(_dg)->payload_size)
> /// This is 24 + payload_size
>
> memcpy(&dg_info->msg, dg, dg_size);
> Destination = dg_info->msg ---> this is a 24 byte
> structure(struct vmci_datagram)
> Source = dg --> this is a 24 byte structure (struct vmci_datagram)
> Size = dg_size = 24 + payload_size
>
>
> {payload_size = 56-24 =32} -- Syzkaller managed to set payload_size to 32.
>
> 35 struct delayed_datagram_info {
> 36 struct datagram_entry *entry;
> 37 struct work_struct work;
> 38 bool in_dg_host_queue;
> 39 /* msg and msg_payload must be together. */
> 40 struct vmci_datagram msg;
> 41 u8 msg_payload[];
> 42 };
>
> So those extra bytes of payload are copied into msg_payload[], so there
> is no bug, but a run time warning is seen while fuzzing with Syzkaller.
>
> One possible way to silence the warning is to split the memcpy() into
> two parts -- one -- copying the msg and second taking care of payload.

And what are the performance impacts of this?

thanks,

greg k-h

2024-01-01 17:45:35

by Gustavo A. R. Silva

Subject: Re: [RFC PATCH] VMCI: Silence memcpy() run-time false positive warning



On 1/1/24 07:08, Harshit Mogalapalli wrote:
> Syzkaller hit 'WARNING in dg_dispatch_as_host' bug.
>
> memcpy: detected field-spanning write (size 56) of single field "&dg_info->msg"
> at drivers/misc/vmw_vmci/vmci_datagram.c:237 (size 24)

This is not a 'false positive warning.' This is a legitimate warning
coming from the fortified memcpy().

Under FORTIFY_SOURCE we should not copy data across multiple members
in a structure. For that we have alternatives like struct_group(), or as
in this case, splitting memcpy(), or as I suggest below, a mix of
direct assignment and memcpy().
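
To illustrate the struct_group() alternative mentioned above, here is a simplified userspace imitation of the idea. It is a sketch, not the kernel macro: MEMBER_GROUP, struct packet and packet_set_body() are hypothetical names, and the real struct_group() has more variants (attributes, tagged groups). The members are declared twice inside an anonymous union, once anonymously and once as a named struct, so a memcpy() can target the named group as a single member whose size covers all grouped fields.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Simplified imitation of the idea behind struct_group(): the same
 * members appear anonymously (so p->a still works) and under NAME
 * (so &p->NAME is one object spanning all of them).
 */
#define MEMBER_GROUP(NAME, ...)			\
	union {					\
		struct { __VA_ARGS__ };		\
		struct { __VA_ARGS__ } NAME;	\
	}

struct packet {
	uint32_t id;
	MEMBER_GROUP(body,
		uint32_t a;
		uint32_t b;
	);
};

/* Fill both grouped members with one copy aimed at a single member. */
static void packet_set_body(struct packet *p, const uint32_t src[2])
{
	assert(p != NULL && src != NULL);
	memcpy(&p->body, src, sizeof(p->body));
}
```

This pattern suits a fixed set of adjacent members. The case in this thread ends in a flexible array, which is why a split copy is discussed instead.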


>
> WARNING: CPU: 0 PID: 1555 at drivers/misc/vmw_vmci/vmci_datagram.c:237
> dg_dispatch_as_host+0x88e/0xa60 drivers/misc/vmw_vmci/vmci_datagram.c:237
>
> Some code commentary, based on my understanding:
>
> 544 #define VMCI_DG_SIZE(_dg) (VMCI_DG_HEADERSIZE + (size_t)(_dg)->payload_size)
> /// This is 24 + payload_size
>
> memcpy(&dg_info->msg, dg, dg_size);
> Destination = dg_info->msg ---> this is a 24 byte
> structure(struct vmci_datagram)
> Source = dg --> this is a 24 byte structure (struct vmci_datagram)
> Size = dg_size = 24 + payload_size
>
>
> {payload_size = 56-24 =32} -- Syzkaller managed to set payload_size to 32.
>
> 35 struct delayed_datagram_info {
> 36 struct datagram_entry *entry;
> 37 struct work_struct work;
> 38 bool in_dg_host_queue;
> 39 /* msg and msg_payload must be together. */
> 40 struct vmci_datagram msg;
> 41 u8 msg_payload[];
> 42 };
>
> So those extra bytes of payload are copied into msg_payload[], so there
> is no bug, but a run time warning is seen while fuzzing with Syzkaller.
>
> One possible way to silence the warning is to split the memcpy() into
> two parts -- one -- copying the msg and second taking care of payload.
>
> Reported-by: syzkaller <[email protected]>
> Suggested-by: Vegard Nossum <[email protected]>
> Signed-off-by: Harshit Mogalapalli <[email protected]>
> ---
> This patch is only tested with the C reproducer, not any testing
> specific to driver is done.
> ---
> drivers/misc/vmw_vmci/vmci_datagram.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/misc/vmw_vmci/vmci_datagram.c b/drivers/misc/vmw_vmci/vmci_datagram.c
> index f50d22882476..b43661590f56 100644
> --- a/drivers/misc/vmw_vmci/vmci_datagram.c
> +++ b/drivers/misc/vmw_vmci/vmci_datagram.c
> @@ -216,6 +216,7 @@ static int dg_dispatch_as_host(u32 context_id, struct vmci_datagram *dg)
> if (dst_entry->run_delayed ||
> dg->src.context == VMCI_HOST_CONTEXT_ID) {
> struct delayed_datagram_info *dg_info;
> + size_t payload_size = dg_size - VMCI_DG_HEADERSIZE;

This seems to be the same as `dg->payload_size`, so I don't think a new
variable is necessary.

>
> if (atomic_add_return(1, &delayed_dg_host_queue_size)
> == VMCI_MAX_DELAYED_DG_HOST_QUEUE_SIZE) {
> @@ -234,7 +235,8 @@ static int dg_dispatch_as_host(u32 context_id, struct vmci_datagram *dg)
>
> dg_info->in_dg_host_queue = true;
> dg_info->entry = dst_entry;
> - memcpy(&dg_info->msg, dg, dg_size);
> + memcpy(&dg_info->msg, dg, VMCI_DG_HEADERSIZE);
> + memcpy(&dg_info->msg_payload, dg + 1, payload_size);

I think a direct assignment and a call to memcpy() is better in this case,
something like this:

dg_info->msg = *dg;
memcpy(&dg_info->msg_payload, dg + 1, dg->payload_size);

However, that `dg + 1` thing is making my eyes twitch. Where exactly are we
making sure that `dg` actually points to an area in memory bigger than
`sizeof(*dg)`?...

We could also use struct_size() during allocation, some lines above:

- dg_info = kmalloc(sizeof(*dg_info) +
- (size_t) dg->payload_size, GFP_ATOMIC);
+ dg_info = kmalloc(struct_size(dg_info, msg_payload, dg->payload_size),
+ GFP_ATOMIC);
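
For readers unfamiliar with struct_size(), the sketch below shows the kind of arithmetic it performs: the size of the structure plus count elements of the trailing array, saturating at SIZE_MAX on overflow so that the allocation fails instead of being undersized. This is a userspace approximation; flex_struct_size(), sat_add() and sat_mul() are hypothetical names, and the __builtin_*_overflow helpers assume GCC or Clang.

```c
#include <stddef.h>
#include <stdint.h>

/* Multiply, returning SIZE_MAX if the result would overflow. */
static size_t sat_mul(size_t a, size_t b)
{
	size_t r;

	return __builtin_mul_overflow(a, b, &r) ? SIZE_MAX : r;
}

/* Add, returning SIZE_MAX if the result would overflow. */
static size_t sat_add(size_t a, size_t b)
{
	size_t r;

	return __builtin_add_overflow(a, b, &r) ? SIZE_MAX : r;
}

/*
 * base:  sizeof() the structure without its flexible array
 * elem:  sizeof() one array element
 * count: number of array elements wanted
 */
static size_t flex_struct_size(size_t base, size_t elem, size_t count)
{
	return sat_add(base, sat_mul(elem, count));
}
```

With a u8 payload the multiplication is trivial, so the benefit here is mainly the overflow-checked addition and the clearer statement of intent.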

--
Gustavo

>
> INIT_WORK(&dg_info->work, dg_delayed_dispatch);
> schedule_work(&dg_info->work);

2024-01-02 18:35:13

by Harshit Mogalapalli

Subject: Re: [RFC PATCH] VMCI: Silence memcpy() run-time false positive warning

Hi Greg,

On 01/01/24 7:25 pm, Greg Kroah-Hartman wrote:
> On Mon, Jan 01, 2024 at 05:08:28AM -0800, Harshit Mogalapalli wrote:
>> Syzkaller hit 'WARNING in dg_dispatch_as_host' bug.
>>
>> memcpy: detected field-spanning write (size 56) of single field "&dg_info->msg"
>> at drivers/misc/vmw_vmci/vmci_datagram.c:237 (size 24)
>>
>> WARNING: CPU: 0 PID: 1555 at drivers/misc/vmw_vmci/vmci_datagram.c:237
>> dg_dispatch_as_host+0x88e/0xa60 drivers/misc/vmw_vmci/vmci_datagram.c:237
>>
>> Some code commentary, based on my understanding:
>>
>> 544 #define VMCI_DG_SIZE(_dg) (VMCI_DG_HEADERSIZE + (size_t)(_dg)->payload_size)
>> /// This is 24 + payload_size
>>
>> memcpy(&dg_info->msg, dg, dg_size);
>> Destination = dg_info->msg ---> this is a 24 byte
>> structure(struct vmci_datagram)
>> Source = dg --> this is a 24 byte structure (struct vmci_datagram)
>> Size = dg_size = 24 + payload_size
>>
>>
>> {payload_size = 56-24 =32} -- Syzkaller managed to set payload_size to 32.
>>
>> 35 struct delayed_datagram_info {
>> 36 struct datagram_entry *entry;
>> 37 struct work_struct work;
>> 38 bool in_dg_host_queue;
>> 39 /* msg and msg_payload must be together. */
>> 40 struct vmci_datagram msg;
>> 41 u8 msg_payload[];
>> 42 };
>>
>> So those extra bytes of payload are copied into msg_payload[], so there
>> is no bug, but a run time warning is seen while fuzzing with Syzkaller.
>>
>> One possible way to silence the warning is to split the memcpy() into
>> two parts -- one -- copying the msg and second taking care of payload.
>
> And what are the performance impacts of this?
>

I haven't done any performance tests on this.

I tried to look at the diff in the assembly code but couldn't draw any
conclusion about performance from it. Also, Gustavo suggested replacing
the two memcpy() calls with a direct assignment for the header and a
memcpy() for the payload.

Is there a way to do perf analysis based on code without access to hardware?

Thanks,
Harshit

> thanks,
>
> greg k-h


2024-01-02 18:38:22

by Harshit Mogalapalli

Subject: Re: [RFC PATCH] VMCI: Silence memcpy() run-time false positive warning

Hi Gustavo,

On 01/01/24 11:13 pm, Gustavo A. R. Silva wrote:
>
>
> On 1/1/24 07:08, Harshit Mogalapalli wrote:
>> Syzkaller hit 'WARNING in dg_dispatch_as_host' bug.
>>
>> memcpy: detected field-spanning write (size 56) of single field
>> "&dg_info->msg"
>> at drivers/misc/vmw_vmci/vmci_datagram.c:237 (size 24)
>
> This is not a 'false positive warning.' This is a legitimate warning
> coming from the fortified memcpy().
>
> Under FORTIFY_SOURCE we should not copy data across multiple members
> in a structure. For that we have alternatives like struct_group(), or as
> in this case, splitting memcpy(), or as I suggest below, a mix of
> direct assignment and memcpy().
>

Thanks for sharing this.
>
>>
>> struct vmci_datagram *dg)
>>           if (dst_entry->run_delayed ||
>>               dg->src.context == VMCI_HOST_CONTEXT_ID) {
>>               struct delayed_datagram_info *dg_info;
>> +            size_t payload_size = dg_size - VMCI_DG_HEADERSIZE;
>
> This seems to be the same as `dg->payload_size`, so I don't think a new
> variable is necessary.
>

Oh right, this is unnecessary. I will remove it.

>>               if (atomic_add_return(1, &delayed_dg_host_queue_size)
>>                   == VMCI_MAX_DELAYED_DG_HOST_QUEUE_SIZE) {
>> @@ -234,7 +235,8 @@ static int dg_dispatch_as_host(u32 context_id,
>> struct vmci_datagram *dg)
>>               dg_info->in_dg_host_queue = true;
>>               dg_info->entry = dst_entry;
>> -            memcpy(&dg_info->msg, dg, dg_size);
>> +            memcpy(&dg_info->msg, dg, VMCI_DG_HEADERSIZE);
>> +            memcpy(&dg_info->msg_payload, dg + 1, payload_size);
>
> I think a direct assignment and a call to memcpy() is better in this case,
> something like this:
>
> dg_info->msg = *dg;
> memcpy(&dg_info->msg_payload, dg + 1, dg->payload_size);
>
> However, that `dg + 1` thing is making my eyes twitch. Where exactly are we
> making sure that `dg` actually points to an area in memory bigger than
> `sizeof(*dg)`?...
>

Going up on the call tree:

-> vmci_transport_dgram_enqueue()
--> vmci_datagram_send()
---> vmci_datagram_dispatch()
----> dg_dispatch_as_host()

static int vmci_transport_dgram_enqueue(
	struct vsock_sock *vsk,
	struct sockaddr_vm *remote_addr,
	struct msghdr *msg,
	size_t len)
{
	int err;
	struct vmci_datagram *dg;

	if (len > VMCI_MAX_DG_PAYLOAD_SIZE)
		return -EMSGSIZE;

	if (!vmci_transport_allow_dgram(vsk, remote_addr->svm_cid))
		return -EPERM;

	/* Allocate a buffer for the user's message and our packet header. */
	dg = kmalloc(len + sizeof(*dg), GFP_KERNEL);
	if (!dg)
		return -ENOMEM;

^^^ dg = kmalloc(len + sizeof(*dg), GFP_KERNEL);
From this, I think we can say that the memory allocated for dg is
bigger than sizeof(*dg).
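
As an illustration of the property being asked about, the sketch below shows what an explicit check could look like when the buffer length is known. It is hypothetical: struct dg_hdr is a stand-in for struct vmci_datagram, and dg_fits() and dg_payload() are not functions from the driver. Nothing in this thread confirms that such a check exists on this path.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 24-byte stand-in for struct vmci_datagram. */
struct dg_hdr {
	uint64_t dst;
	uint64_t src;
	uint64_t payload_size;
};

/*
 * Return true if a buffer of buf_len bytes starting at dg holds the
 * whole header plus the payload_size bytes the header advertises.
 */
static bool dg_fits(const struct dg_hdr *dg, size_t buf_len)
{
	if (buf_len < sizeof(*dg))
		return false;
	return dg->payload_size <= buf_len - sizeof(*dg);
}

/* Address of the first payload byte; the same address as dg + 1. */
static const void *dg_payload(const struct dg_hdr *dg)
{
	return (const unsigned char *)dg + sizeof(*dg);
}
```

Subtracting the header size from buf_len, rather than adding it to payload_size, avoids an overflow when payload_size is very large.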


> We could also use struct_size() during allocation, some lines above:
>
> -                       dg_info = kmalloc(sizeof(*dg_info) +
> -                                   (size_t) dg->payload_size, GFP_ATOMIC);
> +                       dg_info = kmalloc(struct_size(dg_info,
> msg_payload, dg->payload_size),
> +                                         GFP_ATOMIC);
>
Thanks again for the suggestion.

I still haven't worked out how performance compares before and after
the patch. Once I have some reasoning, I will include the above changes
and send a v2.

Thanks,
Harshit
> --
> Gustavo
>
>>               INIT_WORK(&dg_info->work, dg_delayed_dispatch);
>>               schedule_work(&dg_info->work);


2024-01-04 18:32:15

by Vegard Nossum

Subject: Re: [RFC PATCH] VMCI: Silence memcpy() run-time false positive warning


On 01/01/2024 14:55, Greg Kroah-Hartman wrote:
> On Mon, Jan 01, 2024 at 05:08:28AM -0800, Harshit Mogalapalli wrote:
>> One possible way to silence the warning is to split the memcpy() into
>> two parts -- one -- copying the msg and second taking care of payload.
>
> And what are the performance impacts of this?

I did a disassembly diff for the version of the patch that uses
dg->payload_size directly in the second memcpy() and I get this as the
only change:

@@ -419,11 +419,16 @@
mov %rax,%rbx
test %rax,%rax
je
+ mov 0x0(%rbp),%rdx
mov %r14,(%rax)
- mov %r13,%rdx
- mov %rbp,%rsi
- lea 0x30(%rax),%rdi
+ lea 0x18(%rbp),%rsi
+ lea 0x48(%rax),%rdi
movb $0x1,0x28(%rax)
+ mov %rdx,0x30(%rax)
+ mov 0x8(%rbp),%rdx
+ mov %rdx,0x38(%rax)
+ mov 0x10(%rbp),%rdx
+ mov %rdx,0x40(%rax)
call
mov 0x0(%rip),%rsi #
lea 0x8(%rbx),%rdx

Basically, I believe it's inlining the first constant-size memcpy and
keeping the second one as a call.

Overall, the number of memory accesses should be the same.

The biggest impact that I can see is therefore the code size (which
isn't much).

There is also a kmalloc() on the same code path that I assume would
dwarf any performance impact from this patch -- but happy to be corrected.


Vegard

2024-01-04 19:02:57

by Gustavo A. R. Silva

Subject: Re: [RFC PATCH] VMCI: Silence memcpy() run-time false positive warning



On 1/4/24 12:31, Vegard Nossum wrote:
>
> On 01/01/2024 14:55, Greg Kroah-Hartman wrote:
>> On Mon, Jan 01, 2024 at 05:08:28AM -0800, Harshit Mogalapalli wrote:
>>> One possible way to silence the warning is to split the memcpy() into
>>> two parts -- one -- copying the msg and second taking care of payload.
>>
>> And what are the performance impacts of this?
>
> I did a disassembly diff for the version of the patch that uses
> dg->payload_size directly in the second memcpy and I get this as the
> only change:
>
> @@ -419,11 +419,16 @@
>         mov    %rax,%rbx
>         test   %rax,%rax
>         je
> +       mov    0x0(%rbp),%rdx
>         mov    %r14,(%rax)
> -       mov    %r13,%rdx
> -       mov    %rbp,%rsi
> -       lea    0x30(%rax),%rdi
> +       lea    0x18(%rbp),%rsi
> +       lea    0x48(%rax),%rdi
>         movb   $0x1,0x28(%rax)
> +       mov    %rdx,0x30(%rax)
> +       mov    0x8(%rbp),%rdx
> +       mov    %rdx,0x38(%rax)
> +       mov    0x10(%rbp),%rdx
> +       mov    %rdx,0x40(%rax)
>         call
>         mov    0x0(%rip),%rsi        #
>         lea    0x8(%rbx),%rdx
>
> Basically, I believe it's inlining the first constant-size memcpy and
> keeping the second one as a call.
>
> Overall, the number of memory accesses should be the same.
>
> The biggest impact that I can see is therefore the code size (which
> isn't much).

Yep, I don't think this is a problem.

I look forward to reviewing v2 of this patch.

Thanks
--
Gustavo

>
> There is also a kmalloc() on the same code path that I assume would
> dwarf any performance impact from this patch -- but happy to be corrected.
>
>
> Vegard
>