2023-02-27 20:24:49

by Jacob Keller

[permalink] [raw]
Subject: [PATCH] coccinelle: semantic patch to check for potential struct_size calls

include/linux/overflow.h includes helper macros intended for calculating
sizes of allocations. These macros prevent accidental overflow by
saturating at SIZE_MAX.

In general when calculating such sizes use of the macros is preferred. Add
a semantic patch which can detect code patterns which can be replaced by
struct_size.

Note that I set the confidence to medium because this patch doesn't make an
attempt to ensure that the relevant array is actually a flexible array. The
struct_size macro does specifically require a flexible array. In many cases
the detected code could be refactored to a flexible array, but this is not
always possible (such as if there are multiple over-allocations).

Signed-off-by: Jacob Keller <[email protected]>
Cc: Julia Lawall <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Gustavo A. R. Silva <[email protected]>
Cc: [email protected]
Cc: [email protected]

scripts/coccinelle/misc/struct_size.cocci | 74 +++++++++++++++++++++++
1 file changed, 74 insertions(+)
create mode 100644 scripts/coccinelle/misc/struct_size.cocci

diff --git a/scripts/coccinelle/misc/struct_size.cocci b/scripts/coccinelle/misc/struct_size.cocci
new file mode 100644
index 000000000000..4ede9586e3c6
--- /dev/null
+++ b/scripts/coccinelle/misc/struct_size.cocci
@@ -0,0 +1,74 @@
+// SPDX-License-Identifier: GPL-2.0-only
+///
+/// Check for code that could use struct_size().
+///
+// Confidence: Medium
+// Author: Jacob Keller <[email protected]>
+// Copyright: (C) 2023 Intel Corporation
+// Options: --no-includes --include-headers
+
+virtual patch
+virtual context
+virtual org
+virtual report
+
+// the overflow Kunit tests have some code which intentionally does not use
+// the macros, so we want to ignore this code when reporting potential
+// issues.
+@overflow_tests@
+identifier f = overflow_size_helpers_test;
+@@
+
+f
+
+//----------------------------------------------------------
+// For context mode
+//----------------------------------------------------------
+
+@depends on !overflow_tests && context@
+expression E1, E2;
+identifier m;
+@@
+(
+* (sizeof(*E1) + (E2 * sizeof(*E1->m)))
+)
+
+//----------------------------------------------------------
+// For patch mode
+//----------------------------------------------------------
+
+@depends on !overflow_tests && patch@
+expression E1, E2;
+identifier m;
+@@
+(
+- (sizeof(*E1) + (E2 * sizeof(*E1->m)))
++ struct_size(E1, m, E2)
+)
+
+//----------------------------------------------------------
+// For org and report mode
+//----------------------------------------------------------
+
+@r depends on !overflow_tests && (org || report)@
+expression E1, E2;
+identifier m;
+position p;
+@@
+(
+ (sizeof(*E1)@p + (E2 * sizeof(*E1->m)))
+)
+
+@script:python depends on org@
+p << r.p;
+@@
+
+coccilib.org.print_todo(p[0], "WARNING should use struct_size")
+
+@script:python depends on report@
+p << r.p;
+@@
+
+msg="WARNING: Use struct_size"
+coccilib.report.print_report(p[0], msg)
+

base-commit: 982818426a0ffaf93b0621826ed39a84be3d7d62
--
2.39.1.405.gd4c25cc71f83



2023-08-29 20:12:39

by Jacob Keller

[permalink] [raw]
Subject: Re: [PATCH] coccinelle: semantic patch to check for potential struct_size calls



On 8/26/2023 6:19 PM, Kees Cook wrote:
> Hi!
>
> I'm sorry I lost this email! I just found it while trying to clean up
> my inbox.
>
> On Mon, Feb 27, 2023 at 12:24:28PM -0800, Jacob Keller wrote:
>> include/linux/overflow.h includes helper macros intended for calculating
>> sizes of allocations. These macros prevent accidental overflow by
>> saturating at SIZE_MAX.
>>
>> In general when calculating such sizes use of the macros is preferred. Add
>> a semantic patch which can detect code patterns which can be replaced by
>> struct_size.
>>
>> Note that I set the confidence to medium because this patch doesn't make an
>> attempt to ensure that the relevant array is actually a flexible array. The
>> struct_size macro does specifically require a flexible array. In many cases
>> the detected code could be refactored to a flexible array, but this is not
>> always possible (such as if there are multiple over-allocations).
>>
>> Signed-off-by: Jacob Keller <[email protected]>
>> Cc: Julia Lawall <[email protected]>
>> Cc: Kees Cook <[email protected]>
>> Cc: Gustavo A. R. Silva <[email protected]>
>> Cc: [email protected]
>> Cc: [email protected]
>>
>> scripts/coccinelle/misc/struct_size.cocci | 74 +++++++++++++++++++++++
>> 1 file changed, 74 insertions(+)
>> create mode 100644 scripts/coccinelle/misc/struct_size.cocci
>
> Yes! I'd really like to get something like this into the Coccinelle
> scripts.
>
>> diff --git a/scripts/coccinelle/misc/struct_size.cocci b/scripts/coccinelle/misc/struct_size.cocci
>> new file mode 100644
>> index 000000000000..4ede9586e3c6
>> --- /dev/null
>> +++ b/scripts/coccinelle/misc/struct_size.cocci
>> @@ -0,0 +1,74 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +///
>> +/// Check for code that could use struct_size().
>> +///
>> +// Confidence: Medium
>> +// Author: Jacob Keller <[email protected]>
>> +// Copyright: (C) 2023 Intel Corporation
>> +// Options: --no-includes --include-headers
>> +
>> +virtual patch
>> +virtual context
>> +virtual org
>> +virtual report
>> +
>> +// the overflow Kunit tests have some code which intentionally does not use
>> +// the macros, so we want to ignore this code when reporting potential
>> +// issues.
>> +@overflow_tests@
>> +identifier f = overflow_size_helpers_test;
>> +@@
>> +
>> +f
>> +
>> +//----------------------------------------------------------
>> +// For context mode
>> +//----------------------------------------------------------
>> +
>> +@depends on !overflow_tests && context@
>> +expression E1, E2;
>> +identifier m;
>> +@@
>> +(
>> +* (sizeof(*E1) + (E2 * sizeof(*E1->m)))
>> +)
>> +
>> +//----------------------------------------------------------
>> +// For patch mode
>> +//----------------------------------------------------------
>> +
>> +@depends on !overflow_tests && patch@
>> +expression E1, E2;
>> +identifier m;
>> +@@
>> +(
>> +- (sizeof(*E1) + (E2 * sizeof(*E1->m)))
>> ++ struct_size(E1, m, E2)
>> +)
>
> Two notes:
>
> This can lead to false positives (like for struct mux_chip) which
> doesn't use a flexible array member, which means struct_size() will
> actually fail to build (it requires the 2nd arg to be an array).
>

I actually sent a fix for mux chip to refactor it to struct_size too :)

https://lore.kernel.org/all/[email protected]/

> This can miss cases that have more than a single struct depth (which is
> uncommon but happens). I don't know how to match only "substruct.member"
> from "ptr->substruct.member". (I know how to match the whole thing[1],
> though.)
>
Yea I couldn't figure out how to get it to handle both cases here but I
actually prefer reporting cases like mux_chip, since they can usually be
refactored to use struct size properly.

> That isn't reason not to take this patch, though. It's a good start!
>

Right. Both cases like this are why I set the confidence to only medium,
and mentioned it in the commit :D


> Thanks for writing this up!
>
> -Kees
>

Thanks for reviewing. I also sent some struct_size cleanups that look to
have stalled and could use some review or a re-send if necessary at this
point.

I think the full list can be found with this lore.kernel.org search:

https://lore.kernel.org/all/?q=f%3Ajacob.e.keller+AND+%28+s%3Astruct_size+OR+s%3A%22flexible+array%22+%29




2024-01-16 07:03:38

by Dan Carpenter

[permalink] [raw]
Subject: Re: [PATCH] coccinelle: semantic patch to check for potential struct_size calls

What happened to this patch? These sorts of patches go through Kees?

Also it would be nice if it could handle char arrays. It doesn't warn
for the kmalloc in dg_dispatch_as_host():

drivers/misc/vmw_vmci/vmci_datagram.c
227 dg_info = kmalloc(sizeof(*dg_info) +
228 (size_t) dg->payload_size, GFP_ATOMIC);

The Cocci check is looking specifically for:

sizeof(*dg_info) + (sizeof(*dg_info->msg_payload) * dg->payload_size)

But since this flex array is u8 there is no multiply. I don't know how
are it is to add support for char arrays...

Also another common way to write the multiply is:

sizeof(*dg_info) + (sizeof(dg_info->msg_payload[0]) * dg->payload_size)

That should be pretty straight forward to add.

regards,
dan carpenter



2024-01-17 21:54:35

by Jacob Keller

[permalink] [raw]
Subject: RE: [PATCH] coccinelle: semantic patch to check for potential struct_size calls



> -----Original Message-----
> From: Dan Carpenter <[email protected]>
> Sent: Monday, January 15, 2024 11:03 PM
> To: Keller, Jacob E <[email protected]>
> Cc: Julia Lawall <[email protected]>; Kees Cook <[email protected]>;
> Gustavo A . R . Silva <[email protected]>; [email protected]; linux-
> [email protected]; Harshit Mogalapalli <[email protected]>
> Subject: Re: [PATCH] coccinelle: semantic patch to check for potential struct_size
> calls
>
> What happened to this patch? These sorts of patches go through Kees?
>

No one replied and I got side tracked by other projects.

> Also it would be nice if it could handle char arrays. It doesn't warn
> for the kmalloc in dg_dispatch_as_host():
>
Yea it would be nice to handle this too.

> drivers/misc/vmw_vmci/vmci_datagram.c
> 227 dg_info = kmalloc(sizeof(*dg_info) +
> 228 (size_t) dg->payload_size, GFP_ATOMIC);
>
> The Cocci check is looking specifically for:
>
> sizeof(*dg_info) + (sizeof(*dg_info->msg_payload) * dg->payload_size)
>

I think that's a slightly different formulation.

> But since this flex array is u8 there is no multiply. I don't know how
> are it is to add support for char arrays...
>

A separate rule would work.

> Also another common way to write the multiply is:
>
> sizeof(*dg_info) + (sizeof(dg_info->msg_payload[0]) * dg->payload_size)
>
> That should be pretty straight forward to add.
>
> regards,
> dan carpenter
>

Mostly I just lost track of this because I struggled to get traction in the right lists and trees.

Thanks,
Jake


2024-01-18 05:18:39

by Dan Carpenter

[permalink] [raw]
Subject: Re: [PATCH] coccinelle: semantic patch to check for potential struct_size calls

On Wed, Jan 17, 2024 at 09:54:19PM +0000, Keller, Jacob E wrote:
> > drivers/misc/vmw_vmci/vmci_datagram.c
> > 227 dg_info = kmalloc(sizeof(*dg_info) +
> > 228 (size_t) dg->payload_size, GFP_ATOMIC);
> >
> > The Cocci check is looking specifically for:
> >
> > sizeof(*dg_info) + (sizeof(*dg_info->msg_payload) * dg->payload_size)
> >
>
> I think that's a slightly different formulation.
>

I thought that that was what the check was looking for. To me it seems
like an unusual way to write it, but it's not buggy and your Coccinelle
script did trigger a warning correctly... But yeah, I was slightly
puzzled why it would be in this format.

regards,
dan carpenter


2024-02-19 04:39:47

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH] coccinelle: semantic patch to check for potential struct_size calls

On Mon, 27 Feb 2023 12:24:28 -0800, Jacob Keller wrote:
> include/linux/overflow.h includes helper macros intended for calculating
> sizes of allocations. These macros prevent accidental overflow by
> saturating at SIZE_MAX.
>
> In general when calculating such sizes use of the macros is preferred. Add
> a semantic patch which can detect code patterns which can be replaced by
> struct_size.
>
> [...]

If this needs tweaking, we can go from this one.

Applied to for-next/hardening, thanks!

[1/1] coccinelle: semantic patch to check for potential struct_size calls
https://git.kernel.org/kees/c/39fc2f86ae6a

Take care,

--
Kees Cook