I am going to quote Lee Jones who has been doing some snprintf ->
scnprintf refactorings:
"There is a general misunderstanding amongst engineers that
{v}snprintf() returns the length of the data *actually* encoded into the
destination array. However, as per the C99 standard {v}snprintf()
really returns the length of the data that *would have been* written if
there were enough space for it. This misunderstanding has led to
buffer-overruns in the past. It's generally considered safer to use the
{v}scnprintf() variants in their place (or even sprintf() in simple
cases). So let's do that."
To help prevent new instances of snprintf() from popping up, let's add a
check to checkpatch.pl.
Suggested-by: Finn Thain <[email protected]>
Signed-off-by: Justin Stitt <[email protected]>
---
Changes in v6:
- move capture group to only include symbol name (not spaces or paren)
- Link to v5: https://lore.kernel.org/r/[email protected]
Changes in v5:
- use capture groups to let the user know which variation they used
- Link to v4: https://lore.kernel.org/r/[email protected]
Changes in v4:
- also check for vsnprintf variant (thanks Bill)
- Link to v3: https://lore.kernel.org/r/[email protected]
Changes in v3:
- fix indentation
- add reference link (https://github.com/KSPP/linux/issues/105) (thanks Joe)
- Link to v2: https://lore.kernel.org/r/[email protected]
Changes in v2:
- Had a vim moment and deleted a character before sending the patch.
- Replaced the character :)
- Link to v1: https://lore.kernel.org/r/[email protected]
---
From a discussion here [1].
[1]: https://lore.kernel.org/all/[email protected]/
---
scripts/checkpatch.pl | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 9c4c4a61bc83..bdae8a7ae5ed 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -7012,6 +7012,12 @@ sub process {
"Prefer strscpy, strscpy_pad, or __nonstring over strncpy - see: https://github.com/KSPP/linux/issues/90\n" . $herecurr);
}
+# {v}snprintf uses that should likely be {v}scnprintf
+ if ($line =~ /\b((v|)snprintf)\s*\(/) {
+ WARN("SNPRINTF",
+ "Prefer ${2}scnprintf over $1 - see: https://github.com/KSPP/linux/issues/105\n" . $herecurr);
+ }
+
# ethtool_sprintf uses that should likely be ethtool_puts
if ($line =~ /\bethtool_sprintf\s*\(\s*$FuncArg\s*,\s*$FuncArg\s*\)/) {
if (WARN("PREFER_ETHTOOL_PUTS",
---
base-commit: b401b621758e46812da61fa58a67c3fd8d91de0d
change-id: 20240221-snprintf-checkpatch-a864ed67ebd0
Best regards,
--
Justin Stitt <[email protected]>
On Mon, Apr 29, 2024 at 06:39:28PM +0000, Justin Stitt wrote:
> I am going to quote Lee Jones who has been doing some snprintf ->
> scnprintf refactorings:
>
> "There is a general misunderstanding amongst engineers that
> {v}snprintf() returns the length of the data *actually* encoded into the
> destination array. However, as per the C99 standard {v}snprintf()
> really returns the length of the data that *would have been* written if
> there were enough space for it. This misunderstanding has led to
> buffer-overruns in the past. It's generally considered safer to use the
> {v}scnprintf() variants in their place (or even sprintf() in simple
> cases). So let's do that."
>
> To help prevent new instances of snprintf() from popping up, let's add a
> check to checkpatch.pl.
>
> Suggested-by: Finn Thain <[email protected]>
> Signed-off-by: Justin Stitt <[email protected]>
Thanks!
Reviewed-by: Kees Cook <[email protected]>
--
Kees Cook
On Mon, 2024-04-29 at 12:49 -0700, Kees Cook wrote:
> On Mon, Apr 29, 2024 at 06:39:28PM +0000, Justin Stitt wrote:
> > I am going to quote Lee Jones who has been doing some snprintf ->
> > scnprintf refactorings:
> >
> > "There is a general misunderstanding amongst engineers that
> > {v}snprintf() returns the length of the data *actually* encoded into the
> > destination array. However, as per the C99 standard {v}snprintf()
> > really returns the length of the data that *would have been* written if
> > there were enough space for it. This misunderstanding has led to
> > buffer-overruns in the past. It's generally considered safer to use the
> > {v}scnprintf() variants in their place (or even sprintf() in simple
> > cases). So let's do that."
> >
> > To help prevent new instances of snprintf() from popping up, let's add a
> > check to checkpatch.pl.
> >
> > Suggested-by: Finn Thain <[email protected]>
> > Signed-off-by: Justin Stitt <[email protected]>
>
> Thanks!
>
> Reviewed-by: Kees Cook <[email protected]>
>
$ git grep -P '\b((v|)snprintf)\s*\(' | wc -l
7745
$ git grep -P '(?:return\s+|=\s*)\b((v|)snprintf)\s*\(' | wc -l
1626
Given there are ~5000 uses of these that don't care
whether or not it's snprintf or scnprintf, I think this
is not great.
I'd much rather make sure the return value of the call
is used before suggesting an alternative.
$ git grep -P '\b((v|)snprintf)\s*\(.*PAGE_SIZE' | wc -l
515
And about 1/3 of these snprintf calls are for sysfs style
output that ideally would be converted to sysfs_emit or
sysfs_emit_at instead.
On Mon, 29 Apr 2024, Justin Stitt wrote:
> I am going to quote Lee Jones who has been doing some snprintf ->
> scnprintf refactorings:
>
> "There is a general misunderstanding amongst engineers that
> {v}snprintf() returns the length of the data *actually* encoded into the
> destination array. However, as per the C99 standard {v}snprintf()
> really returns the length of the data that *would have been* written if
> there were enough space for it. This misunderstanding has led to
> buffer-overruns in the past. It's generally considered safer to use the
> {v}scnprintf() variants in their place (or even sprintf() in simple
> cases). So let's do that."
>
> To help prevent new instances of snprintf() from popping up, let's add a
> check to checkpatch.pl.
>
> Suggested-by: Finn Thain <[email protected]>
> Signed-off-by: Justin Stitt <[email protected]>
> ---
> Changes in v6:
> - move capture group to only include symbol name (not spaces or paren)
> - Link to v5: https://lore.kernel.org/r/[email protected]
>
> Changes in v5:
> - use capture groups to let the user know which variation they used
> - Link to v4: https://lore.kernel.org/r/[email protected]
>
> Changes in v4:
> - also check for vsnprintf variant (thanks Bill)
> - Link to v3: https://lore.kernel.org/r/[email protected]
>
> Changes in v3:
> - fix indentation
> - add reference link (https://github.com/KSPP/linux/issues/105) (thanks Joe)
> - Link to v2: https://lore.kernel.org/r/[email protected]
>
> Changes in v2:
> - Had a vim moment and deleted a character before sending the patch.
> - Replaced the character :)
> - Link to v1: https://lore.kernel.org/r/[email protected]
> ---
> From a discussion here [1].
>
> [1]: https://lore.kernel.org/all/[email protected]/
> ---
> scripts/checkpatch.pl | 6 ++++++
> 1 file changed, 6 insertions(+)
Reviewed-by: Lee Jones <[email protected]>
--
Lee Jones [李琼斯]
On Mon, 29 Apr 2024, Joe Perches wrote:
> On Mon, 2024-04-29 at 12:49 -0700, Kees Cook wrote:
> > On Mon, Apr 29, 2024 at 06:39:28PM +0000, Justin Stitt wrote:
> > > I am going to quote Lee Jones who has been doing some snprintf ->
> > > scnprintf refactorings:
> > >
> > > "There is a general misunderstanding amongst engineers that
> > > {v}snprintf() returns the length of the data *actually* encoded into the
> > > destination array. However, as per the C99 standard {v}snprintf()
> > > really returns the length of the data that *would have been* written if
> > > there were enough space for it. This misunderstanding has led to
> > > buffer-overruns in the past. It's generally considered safer to use the
> > > {v}scnprintf() variants in their place (or even sprintf() in simple
> > > cases). So let's do that."
> > >
> > > To help prevent new instances of snprintf() from popping up, let's add a
> > > check to checkpatch.pl.
> > >
> > > Suggested-by: Finn Thain <[email protected]>
> > > Signed-off-by: Justin Stitt <[email protected]>
> >
> > Thanks!
> >
> > Reviewed-by: Kees Cook <[email protected]>
> >
>
> $ git grep -P '\b((v|)snprintf)\s*\(' | wc -l
> 7745
> $ git grep -P '(?:return\s+|=\s*)\b((v|)snprintf)\s*\(' | wc -l
> 1626
>
> Given there are ~5000 uses of these that don't care
> whether or not it's snprintf or scnprintf, I think this
> is not great.
>
> I'd much rather make sure the return value of the call
> is used before suggesting an alternative.
>
> $ git grep -P '\b((v|)snprintf)\s*\(.*PAGE_SIZE' | wc -l
> 515
>
> And about 1/3 of these snprintf calls are for sysfs style
> output that ideally would be converted to sysfs_emit or
> sysfs_emit_at instead.
I am working on the migration of these (this patch was spun off from
that project in fact). Some subsystems are currently prioritising the
status quo (a.k.a. "no churn"), but most have been accepting of the
changes.
Planning to get back to it once the CVE project has calmed a little.
Those numbers should diminish over time.
--
Lee Jones [李琼斯]
On Mon, Apr 29, 2024 at 08:21:49PM -0700, Joe Perches wrote:
> On Mon, 2024-04-29 at 12:49 -0700, Kees Cook wrote:
> > On Mon, Apr 29, 2024 at 06:39:28PM +0000, Justin Stitt wrote:
> > > I am going to quote Lee Jones who has been doing some snprintf ->
> > > scnprintf refactorings:
> > >
> > > "There is a general misunderstanding amongst engineers that
> > > {v}snprintf() returns the length of the data *actually* encoded into the
> > > destination array. However, as per the C99 standard {v}snprintf()
> > > really returns the length of the data that *would have been* written if
> > > there were enough space for it. This misunderstanding has led to
> > > buffer-overruns in the past. It's generally considered safer to use the
> > > {v}scnprintf() variants in their place (or even sprintf() in simple
> > > cases). So let's do that."
> > >
> > > To help prevent new instances of snprintf() from popping up, let's add a
> > > check to checkpatch.pl.
> > >
> > > Suggested-by: Finn Thain <[email protected]>
> > > Signed-off-by: Justin Stitt <[email protected]>
> >
> > Thanks!
> >
> > Reviewed-by: Kees Cook <[email protected]>
> >
>
> $ git grep -P '\b((v|)snprintf)\s*\(' | wc -l
> 7745
> $ git grep -P '(?:return\s+|=\s*)\b((v|)snprintf)\s*\(' | wc -l
> 1626
>
> Given there are ~5000 uses of these that don't care
> whether or not it's snprintf or scnprintf, I think this
> is not great.
But let's not add more of either case. :)
> I'd much rather make sure the return value of the call
> is used before suggesting an alternative.
>
> $ git grep -P '\b((v|)snprintf)\s*\(.*PAGE_SIZE' | wc -l
> 515
>
> And about 1/3 of these snprintf calls are for sysfs style
> output that ideally would be converted to sysfs_emit or
> sysfs_emit_at instead.
Detecting that we're in the right place for sysfs_emit seems out of
scope for here, but maybe it should be more clearly called out by the
contents at the reported URL?
--
Kees Cook