Convert strcat() calls which only append single characters
to the end of res into simpler (and most likely cheaper)
single assignment statements.
Signed-off-by: Joey Pabalinas <[email protected]>
1 file changed, 9 insertions(+), 10 deletions(-)
diff --git a/scripts/kconfig/symbol.c b/scripts/kconfig/symbol.c
index cca9663be5ddd91870..67600f48660f2ac142 100644
--- a/scripts/kconfig/symbol.c
+++ b/scripts/kconfig/symbol.c
@@ -910,8 +910,7 @@ char *sym_expand_string_value(const char *in)
* freed, so make sure to always allocate a new string
*/
reslen = strlen(in) + 1;
- res = xmalloc(reslen);
- res[0] = '\0';
+ res = xcalloc(1, reslen);
while ((src = strchr(in, '$'))) {
char *p, name[SYMBOL_MAXLENGTH];
@@ -951,7 +950,7 @@ const char *sym_escape_string_value(const char *in)
{
const char *p;
size_t reslen;
- char *res;
+ char *res, *end;
size_t l;
reslen = strlen(in) + strlen("\"\"") + 1;
@@ -968,25 +967,25 @@ const char *sym_escape_string_value(const char *in)
p++;
}
- res = xmalloc(reslen);
- res[0] = '\0';
-
- strcat(res, "\"");
+ res = xcalloc(1, reslen);
+ end = res;
+ *end++ = '\"';
p = in;
for (;;) {
l = strcspn(p, "\"\\");
strncat(res, p, l);
p += l;
+ end += l;
if (p[0] == '\0')
break;
- strcat(res, "\\");
- strncat(res, p++, 1);
+ *end++ = '\\';
+ *end++ = *p++;
}
+ *end = '\"';
- strcat(res, "\"");
return res;
}
--
2.16.2
Sorry about the double send, clipboards are very fickle beasts :(
On Thu, Mar 01, 2018 at 09:44:24PM -1000, Joey Pabalinas wrote:
> Convert strcat() calls which only append single characters
> to the end of res into simpler (and most likely cheaper)
> single assignment statements.
>
> Signed-off-by: Joey Pabalinas <[email protected]>
>
> 1 file changed, 9 insertions(+), 10 deletions(-)
>
> diff --git a/scripts/kconfig/symbol.c b/scripts/kconfig/symbol.c
> index cca9663be5ddd91870..67600f48660f2ac142 100644
> --- a/scripts/kconfig/symbol.c
> +++ b/scripts/kconfig/symbol.c
> @@ -910,8 +910,7 @@ char *sym_expand_string_value(const char *in)
> * freed, so make sure to always allocate a new string
> */
> reslen = strlen(in) + 1;
> - res = xmalloc(reslen);
> - res[0] = '\0';
> + res = xcalloc(1, reslen);
Not sure this is an improvement. Zeroing the bytes after the initial
null terminator is redundant, and the explicit '\0' makes it clearer to
me what's going on.
>
> while ((src = strchr(in, '$'))) {
> char *p, name[SYMBOL_MAXLENGTH];
> @@ -951,7 +950,7 @@ const char *sym_escape_string_value(const char *in)
> {
> const char *p;
> size_t reslen;
> - char *res;
> + char *res, *end;
> size_t l;
>
> reslen = strlen(in) + strlen("\"\"") + 1;
> @@ -968,25 +967,25 @@ const char *sym_escape_string_value(const char *in)
> p++;
> }
>
> - res = xmalloc(reslen);
> - res[0] = '\0';
> -
> - strcat(res, "\"");
> + res = xcalloc(1, reslen);
> + end = res;
> + *end++ = '\"';
Could make this xmalloc() instead and write the null terminator via
'end' -- see below.
>
> p = in;
> for (;;) {
> l = strcspn(p, "\"\\");
> strncat(res, p, l);
Could replace this with memcpy(end, p, l).
At that point, all the writing would be done via 'end', and 'res' would
just be a way to remember the starting address of the result.
> p += l;
> + end += l;
>
> if (p[0] == '\0')
Unrelated, but *p is clearer than p[0] to me when doing pointers rather
than indices.
> break;
>
> - strcat(res, "\\");
> - strncat(res, p++, 1);
> + *end++ = '\\';
> + *end++ = *p++;
> }
> + *end = '\"';
With all the writing done via 'end', the null terminator could be
written here.
>
> - strcat(res, "\"");
> return res;
> }
>
> --
> 2.16.2
>
> Sorry about the double send, clipboards are very fickle beasts :(
I like the approach, but I wonder if we can take it a bit further.
Here's what I'd do:
1. Rename the 'in' parameter to 's'.
2. Rename 'p' to 'in'.
3. Rename 'end' to 'out'
At that point, you're reading from 'in' and writing to 'out', which
seems pretty nice and readable.
This code is pretty cold by the way, so it wouldn't matter for
performance. GCC knows how functions like strcat() work too, and uses
that to optimize (see
https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html).
I'm all for trying to make Kconfig's code neater though.
Tip: If you want to run Valgrind, you can do it with a command like
$ ARCH=x86 SRCARCH=x86 KERNELVERSION=`make kernelversion` valgrind scripts/kconfig/mconf Kconfig
That works the same as
$ make menuconfig
Cheers,
Ulf
On Fri, Mar 02, 2018 at 02:44:53PM +0100, Ulf Magnusson wrote:
> Not sure this is an improvement. Zeroing the bytes after the initial
> null terminator is redundant, and the explicit '\0' makes it clearer to
> me what's going on.
Yes, I agree with you, that is definitely quite true. This along with
the other comments you made me want to rethink this a little bit.
On Fri, Mar 02, 2018 at 02:44:53PM +0100, Ulf Magnusson wrote:
> I like the approach, but I wonder if we can take it a bit further.
> Here's what I'd do:
>
> 1. Rename the 'in' parameter to 's'.
> 2. Rename 'p' to 'in'.
> 3. Rename 'end' to 'out'
>
> At that point, you're reading from 'in' and writing to 'out', which
> seems pretty nice and readable.
>
> This code is pretty cold by the way, so it wouldn't matter for
> performance. GCC knows how functions like strcat() work too, and uses
> that to optimize (see
> https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html).
>
> I'm all for trying to make Kconfig's code neater though.
Since this code is pretty cold (completely agree with you there), I
think it would actually be much more useful to rework my patch to
have a more style-centric approach rather than an optimization-centric
one; this code would definitely benefit from being neater.
Some useful changes would be to rename of the _atrociously_ short
identifiers like p and l.
Anyway I'll give that link a read over and try and make a V2 later
on today.
Appreciate the feedback, thanks for the comments!
--
Cheers,
Joey Pabalinas
On Sat, Mar 3, 2018 at 12:12 AM, Joey Pabalinas <[email protected]> wrote:
> On Fri, Mar 02, 2018 at 02:44:53PM +0100, Ulf Magnusson wrote:
>> Not sure this is an improvement. Zeroing the bytes after the initial
>> null terminator is redundant, and the explicit '\0' makes it clearer to
>> me what's going on.
>
> Yes, I agree with you, that is definitely quite true. This along with
> the other comments you made me want to rethink this a little bit.
>
> On Fri, Mar 02, 2018 at 02:44:53PM +0100, Ulf Magnusson wrote:
>> I like the approach, but I wonder if we can take it a bit further.
>> Here's what I'd do:
>>
>> 1. Rename the 'in' parameter to 's'.
>> 2. Rename 'p' to 'in'.
>> 3. Rename 'end' to 'out'
>>
>> At that point, you're reading from 'in' and writing to 'out', which
>> seems pretty nice and readable.
>>
>> This code is pretty cold by the way, so it wouldn't matter for
>> performance. GCC knows how functions like strcat() work too, and uses
>> that to optimize (see
>> https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html).
>>
>> I'm all for trying to make Kconfig's code neater though.
>
> Since this code is pretty cold (completely agree with you there), I
> think it would actually be much more useful to rework my patch to
> have a more style-centric approach rather than an optimization-centric
> one; this code would definitely benefit from being neater.
I actually prefer the memcpy() version for style reasons, even though
it might've looked like an optimization:
With strncat(), the result string is written via both 'res' and 'end'.
With memcpy(), it's only written via the 'end'. That seems less
twisty.
Maybe this is outside the scope of the original patch, but while we're here. :)
>
> Some useful changes would be to rename of the _atrociously_ short
> identifiers like p and l.
Yeah, 'l' in particular isn't the best name, IMO ('len' is both short
and explicit, and won't be confused for 1). 'p' can be fine if it's
obvious in context (bit dubious here), but 'in' and 'out' (for 'end')
would be more informative.
's' is clear from convention to me. In general, I fully agree that you
should avoid hard-to-guess names though.
>
> Anyway I'll give that link a read over and try and make a V2 later
> on today.
>
> Appreciate the feedback, thanks for the comments!
>
> --
> Cheers,
> Joey Pabalinas
Cheers,
Ulf
On Sat, Mar 3, 2018 at 7:14 PM, Ulf Magnusson <[email protected]> wrote:
> On Sat, Mar 3, 2018 at 12:12 AM, Joey Pabalinas <[email protected]> wrote:
>> On Fri, Mar 02, 2018 at 02:44:53PM +0100, Ulf Magnusson wrote:
>>> Not sure this is an improvement. Zeroing the bytes after the initial
>>> null terminator is redundant, and the explicit '\0' makes it clearer to
>>> me what's going on.
>>
>> Yes, I agree with you, that is definitely quite true. This along with
>> the other comments you made me want to rethink this a little bit.
>>
>> On Fri, Mar 02, 2018 at 02:44:53PM +0100, Ulf Magnusson wrote:
>>> I like the approach, but I wonder if we can take it a bit further.
>>> Here's what I'd do:
>>>
>>> 1. Rename the 'in' parameter to 's'.
>>> 2. Rename 'p' to 'in'.
>>> 3. Rename 'end' to 'out'
>>>
>>> At that point, you're reading from 'in' and writing to 'out', which
>>> seems pretty nice and readable.
>>>
>>> This code is pretty cold by the way, so it wouldn't matter for
>>> performance. GCC knows how functions like strcat() work too, and uses
>>> that to optimize (see
>>> https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html).
>>>
>>> I'm all for trying to make Kconfig's code neater though.
>>
>> Since this code is pretty cold (completely agree with you there), I
>> think it would actually be much more useful to rework my patch to
>> have a more style-centric approach rather than an optimization-centric
>> one; this code would definitely benefit from being neater.
>
> I actually prefer the memcpy() version for style reasons, even though
> it might've looked like an optimization:
>
> With strncat(), the result string is written via both 'res' and 'end'.
> With memcpy(), it's only written via the 'end'. That seems less
> twisty.
>
> Maybe this is outside the scope of the original patch, but while we're here. :)
>
>>
>> Some useful changes would be to rename of the _atrociously_ short
>> identifiers like p and l.
>
> Yeah, 'l' in particular isn't the best name, IMO ('len' is both short
> and explicit, and won't be confused for 1). 'p' can be fine if it's
> obvious in context (bit dubious here), but 'in' and 'out' (for 'end')
> would be more informative.
Another alternative: src/dst
>
> 's' is clear from convention to me. In general, I fully agree that you
> should avoid hard-to-guess names though.
>
>>
>> Anyway I'll give that link a read over and try and make a V2 later
>> on today.
>>
>> Appreciate the feedback, thanks for the comments!
>>
>> --
>> Cheers,
>> Joey Pabalinas
>
> Cheers,
> Ulf
Cheers,
Ulf