2021-10-18 03:50:44

by Yang Jihong

[permalink] [raw]
Subject: [QUESTION] Performance deterioration caused by commit 85f726a35e504418

Hi Tom and Steven,

commit 85f726a35e504418 use strncpy instead of memcpy when copying comm,
on ARM64 machine, this commit causes performance degradation.

I test the number of instructions executed by invoking the
trace_sched_switch function once on an arm64 machine:
1. Use memcpy, the number of instructions executed is 850.
2. Use strncpy, the number of instructions executed 1100.
That is, use strncpy is almost 250 more instructions than memcpy.

Has the impact on performance been considered in this commit? :)
What is the impact of revert the patch?

Kind regards,
Jihong


2021-10-18 13:50:55

by Steven Rostedt

[permalink] [raw]
Subject: Re: [QUESTION] Performance deterioration caused by commit 85f726a35e504418

On Mon, 18 Oct 2021 11:23:14 +0800
Yang Jihong <[email protected]> wrote:

> Hi Tom and Steven,
>
> commit 85f726a35e504418 use strncpy instead of memcpy when copying comm,
> on ARM64 machine, this commit causes performance degradation.
>
> I test the number of instructions executed by invoking the
> trace_sched_switch function once on an arm64 machine:
> 1. Use memcpy, the number of instructions executed is 850.
> 2. Use strncpy, the number of instructions executed 1100.
> That is, use strncpy is almost 250 more instructions than memcpy.
>
> Has the impact on performance been considered in this commit? :)
> What is the impact of revert the patch?
>

It's a security issue. And like everything security, there's always going
to be a performance impact. Look at the performance impact due to spectre
and meltdown!

That said, although memcpy() may not be used, we don't need strncpy.
strncpy() will pad the rest of the string with nul bytes. But since the
memory the string is being recorded into is already initialized (or can be
if it isn't), we could use the faster strlcpy().

Have you tried testing it by switching strncpy() with strlcpy()?

-- Steve

2021-10-19 02:42:30

by Yang Jihong

[permalink] [raw]
Subject: Re: [QUESTION] Performance deterioration caused by commit 85f726a35e504418

Hi Steve,

On 2021/10/18 21:37, Steven Rostedt wrote:
> On Mon, 18 Oct 2021 11:23:14 +0800
> Yang Jihong <[email protected]> wrote:
>
>> Hi Tom and Steven,
>>
>> commit 85f726a35e504418 use strncpy instead of memcpy when copying comm,
>> on ARM64 machine, this commit causes performance degradation.
>>
>> I test the number of instructions executed by invoking the
>> trace_sched_switch function once on an arm64 machine:
>> 1. Use memcpy, the number of instructions executed is 850.
>> 2. Use strncpy, the number of instructions executed 1100.
>> That is, use strncpy is almost 250 more instructions than memcpy.
>>
>> Has the impact on performance been considered in this commit? :)
>> What is the impact of revert the patch?
>>
>
> It's a security issue. And like everything security, there's always going
> to be a performance impact. Look at the performance impact due to spectre
> and meltdown!
>
> That said, although memcpy() may not be used, we don't need strncpy.
> strncpy() will pad the rest of the string with nul bytes. But since the
> memory the string is being recorded into is already initialized (or can be
> if it isn't), we could use the faster strlcpy().
>
> Have you tried testing it by switching strncpy() with strlcpy()?
>
I have tried testing it by switching strncpy() with strlcpy(), there is
no performance improvement, probably because the strlen function is
called in strlpy and the string is traversed each time.

Kind regards,
Jihong
> -- Steve
> .
>

2021-10-19 02:54:57

by Steven Rostedt

[permalink] [raw]
Subject: Re: [QUESTION] Performance deterioration caused by commit 85f726a35e504418

On Tue, 19 Oct 2021 10:39:47 +0800
Yang Jihong <[email protected]> wrote:

> Hi Steve,
>
> On 2021/10/18 21:37, Steven Rostedt wrote:
> > On Mon, 18 Oct 2021 11:23:14 +0800
> > Yang Jihong <[email protected]> wrote:
> >
> >> Hi Tom and Steven,
> >>
> >> commit 85f726a35e504418 use strncpy instead of memcpy when copying comm,
> >> on ARM64 machine, this commit causes performance degradation.
> >>
> >> I test the number of instructions executed by invoking the
> >> trace_sched_switch function once on an arm64 machine:
> >> 1. Use memcpy, the number of instructions executed is 850.
> >> 2. Use strncpy, the number of instructions executed 1100.
> >> That is, use strncpy is almost 250 more instructions than memcpy.
> >>
> >> Has the impact on performance been considered in this commit? :)
> >> What is the impact of revert the patch?
> >>
> >
> > It's a security issue. And like everything security, there's always going
> > to be a performance impact. Look at the performance impact due to spectre
> > and meltdown!
> >
> > That said, although memcpy() may not be used, we don't need strncpy.
> > strncpy() will pad the rest of the string with nul bytes. But since the
> > memory the string is being recorded into is already initialized (or can be
> > if it isn't), we could use the faster strlcpy().
> >
> > Have you tried testing it by switching strncpy() with strlcpy()?
> >
> I have tried testing it by switching strncpy() with strlcpy(), there is
> no performance improvement, probably because the strlen function is
> called in strlpy and the string is traversed each time.

Then there's not much we can do. Security trumps performance. Not to
mention, the garbage in the comm after the '\0' causes the histograms to
produce strange results.

Now for the saved_cmdlines, since it isn't exported directly to user space,
that one may be put back to memcpy().

Tom, was there a reason to change saved_cmdlines(), as I'm not sure that is
leaked. It looks like it is printed with the normal seq_printf() in
saved_cmdlines_show().

And it doesn't even look like the saved_cmdlines() is even initialized to
zero, so it itself could leak memory if it was exposed.

-- Steve

2021-10-19 17:33:37

by Tom Zanussi

[permalink] [raw]
Subject: Re: [QUESTION] Performance deterioration caused by commit 85f726a35e504418

Hi Steve,

On 10/18/2021 9:51 PM, Steven Rostedt wrote:
> On Tue, 19 Oct 2021 10:39:47 +0800
> Yang Jihong <[email protected]> wrote:
>
>> Hi Steve,
>>
>> On 2021/10/18 21:37, Steven Rostedt wrote:
>>> On Mon, 18 Oct 2021 11:23:14 +0800
>>> Yang Jihong <[email protected]> wrote:
>>>
>>>> Hi Tom and Steven,
>>>>
>>>> commit 85f726a35e504418 use strncpy instead of memcpy when copying comm,
>>>> on ARM64 machine, this commit causes performance degradation.
>>>>
>>>> I test the number of instructions executed by invoking the
>>>> trace_sched_switch function once on an arm64 machine:
>>>> 1. Use memcpy, the number of instructions executed is 850.
>>>> 2. Use strncpy, the number of instructions executed 1100.
>>>> That is, use strncpy is almost 250 more instructions than memcpy.
>>>>
>>>> Has the impact on performance been considered in this commit? :)
>>>> What is the impact of revert the patch?
>>>>
>>>
>>> It's a security issue. And like everything security, there's always going
>>> to be a performance impact. Look at the performance impact due to spectre
>>> and meltdown!
>>>
>>> That said, although memcpy() may not be used, we don't need strncpy.
>>> strncpy() will pad the rest of the string with nul bytes. But since the
>>> memory the string is being recorded into is already initialized (or can be
>>> if it isn't), we could use the faster strlcpy().
>>>
>>> Have you tried testing it by switching strncpy() with strlcpy()?
>>>
>> I have tried testing it by switching strncpy() with strlcpy(), there is
>> no performance improvement, probably because the strlen function is
>> called in strlpy and the string is traversed each time.
>
> Then there's not much we can do. Security trumps performance. Not to
> mention, the garbage in the comm after the '\0' causes the histograms to
> produce strange results.
>
> Now for the saved_cmdlines, since it isn't exported directly to user space,
> that one may be put back to memcpy().
>
> Tom, was there a reason to change saved_cmdlines(), as I'm not sure that is
> leaked. It looks like it is printed with the normal seq_printf() in
> saved_cmdlines_show().
>

I don't think either of the changes in commit 85f726a35e504418 are directly related to the original problem [1] and therefore changing them back to memcpy or whatever shouldn't affect the histograms since that data is never used in keys.

Commit 85f726a35e504418 was basically a follow-on to commit 9f0bbf3115ca (tracing: Use strncpy instead of memcpy for string keys in hist triggers) and was added for completeness after examining other uses of memcpy in the tracing code (there's even a comment in there from you about possible performance hits from changing it ;-)

So anyway, as far as the histograms go, I think optimizing the two changes in 85f726a35e504418 while ignoring trailing garbage can be done without affecting the histogram correctness.

Tom

[1] https://lore.kernel.org/all/50c35ae1267d64eee975b8125e151e600071d4dc.1549309756.git.tom.zanussi@linux.intel.com/

> And it doesn't even look like the saved_cmdlines() is even initialized to
> zero, so it itself could leak memory if it was exposed.
>
> -- Steve
>

2021-10-19 18:11:44

by Steven Rostedt

[permalink] [raw]
Subject: Re: [QUESTION] Performance deterioration caused by commit 85f726a35e504418

On Tue, 19 Oct 2021 12:30:28 -0500
"Zanussi, Tom" <[email protected]> wrote:

> So anyway, as far as the histograms go, I think optimizing the two
> changes in 85f726a35e504418 while ignoring trailing garbage can be done
> without affecting the histogram correctness.

So, none of that is exported to user space?

-- Steve

2021-10-19 18:42:58

by Tom Zanussi

[permalink] [raw]
Subject: Re: [QUESTION] Performance deterioration caused by commit 85f726a35e504418

On 10/19/2021 1:10 PM, Steven Rostedt wrote:
> On Tue, 19 Oct 2021 12:30:28 -0500
> "Zanussi, Tom" <[email protected]> wrote:
>
>> So anyway, as far as the histograms go, I think optimizing the two
>> changes in 85f726a35e504418 while ignoring trailing garbage can be done
>> without affecting the histogram correctness.
>
> So, none of that is exported to user space?

I meant just that any optimization of those two things that ignored the
trailing garbage wouldn't affect the histogram keys.

But yeah I think you're correct that ignoring it in the case of
saved_cmdlines wouldn't be a problem either but it would be in the case of
max_buffer since that's exported by the ring buffer.

Tom

>
> -- Steve
>

2021-10-20 02:03:12

by Yang Jihong

[permalink] [raw]
Subject: Re: [QUESTION] Performance deterioration caused by commit 85f726a35e504418

Hi, Steve and Tom

On 2021/10/20 2:38, Zanussi, Tom wrote:
> On 10/19/2021 1:10 PM, Steven Rostedt wrote:
>> On Tue, 19 Oct 2021 12:30:28 -0500
>> "Zanussi, Tom" <[email protected]> wrote:
>>
>>> So anyway, as far as the histograms go, I think optimizing the two
>>> changes in 85f726a35e504418 while ignoring trailing garbage can be done
>>> without affecting the histogram correctness.
>>
>> So, none of that is exported to user space?
>
> I meant just that any optimization of those two things that ignored the
> trailing garbage wouldn't affect the histogram keys.
>
> But yeah I think you're correct that ignoring it in the case of
> saved_cmdlines wouldn't be a problem either but it would be in the case of
> max_buffer since that's exported by the ring buffer.

OK. Thanks very much for your patience. :)

Kind regards,
Jihong
>
> Tom
>
>>
>> -- Steve
>>
> .