2022-05-23 07:24:49

by Stefan Wahren

Subject: vchiq: Performance regression since 5.18-rc1

Hi,

While testing the staging/vc04_services/interface/vchiq_arm driver with
my Raspberry Pi 3 B+ (multi_v7_defconfig), I noticed a huge performance
regression since [ff042f4a9b050895a42cae893cc01fa2ca81b95c] mm:
lru_cache_disable: replace work queue synchronization with synchronize_rcu

Usually I run "vchiq_test -f 1" to check that the driver is still working [1].

Before commit:

real    0m1,500s
user    0m0,068s
sys    0m0,846s

After commit:

real    7m11,449s
user    0m2,049s
sys    0m0,023s

Best regards

[1] - https://github.com/raspberrypi/userland
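
For scale, the two "real" timings reported above can be compared with a short sketch. The helper name parse_time is hypothetical and exists only for this illustration; it assumes the "NmS,FFFs" layout that the shell's time output uses in this report (decimal comma).

```python
import re


def parse_time(value: str) -> float:
    """Convert a `time` value such as '7m11,449s' to seconds.

    Accepts either a decimal comma or a decimal point.
    """
    match = re.fullmatch(r"(\d+)m(\d+)[,.](\d+)s", value)
    if match is None:
        raise ValueError(f"unrecognized time value: {value!r}")
    minutes, seconds, fraction = match.groups()
    return int(minutes) * 60 + float(f"{seconds}.{fraction}")


before = parse_time("0m1,500s")   # wall-clock time before the commit
after = parse_time("7m11,449s")   # wall-clock time after the commit
slowdown = after / before
print(f"{before:.3f}s -> {after:.3f}s, about {slowdown:.0f}x slower")
# prints: 1.500s -> 431.449s, about 288x slower
```

Only the wall-clock time grows; the "sys" time actually drops, which suggests the test spends the extra time waiting rather than executing.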




2022-05-23 08:04:56

by Paul E. McKenney

Subject: Re: vchiq: Performance regression since 5.18-rc1

On Sun, May 22, 2022 at 01:22:00AM +0200, Stefan Wahren wrote:
> Hi,
>
> while testing the staging/vc04_services/interface/vchiq_arm driver with my
> Raspberry Pi 3 B+ (multi_v7_defconfig) i noticed a huge performance
> regression since [ff042f4a9b050895a42cae893cc01fa2ca81b95c] mm:
> lru_cache_disable: replace work queue synchronization with synchronize_rcu
>
> Usually i run "vchiq_test -f 1" to see the driver is still working [1].
>
> Before commit:
>
> real    0m1,500s
> user    0m0,068s
> sys     0m0,846s
>
> After commit:
>
> real    7m11,449s
> user    0m2,049s
> sys     0m0,023s
>
> Best regards
>
> [1] - https://github.com/raspberrypi/userland

Please feel free to try the patch shown below. Or the pair of patches
from Rik here:

https://lore.kernel.org/lkml/[email protected]/
https://lore.kernel.org/lkml/[email protected]/

There is work ongoing to produce something better, but ongoing slowly.
Especially my part of that work.

Thanx, Paul

------------------------------------------------------------------------

From [email protected] Mon Feb 14 11:05:49 2022
Date: Mon, 14 Feb 2022 11:05:49 -0800
From: "Paul E. McKenney" <[email protected]>
To: [email protected]
Cc: [email protected], [email protected], [email protected],
[email protected], [email protected]
Subject: [PATCH RFC fs/namespace] Make kern_unmount() use
synchronize_rcu_expedited()
Message-ID: <20220214190549.GA2815154@paulmck-ThinkPad-P17-Gen-1>
Reply-To: [email protected]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Status: RO
Content-Length: 1036
Lines: 32

Experimental. Not for inclusion. Yet, anyway.

Freeing large numbers of namespaces in quick succession can result in
a bottleneck on the synchronize_rcu() invoked from kern_unmount().
This patch applies the synchronize_rcu_expedited() hammer to allow
further testing and fault isolation.

Hey, at least there was no need to change the comment! ;-)

Cc: Alexander Viro <[email protected]>
Cc: <[email protected]>
Cc: <[email protected]>
Not-yet-signed-off-by: Paul E. McKenney <[email protected]>

---

namespace.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/namespace.c b/fs/namespace.c
index 40b994a29e90d..79c50ad0ade5b 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -4389,7 +4389,7 @@ void kern_unmount(struct vfsmount *mnt)
 	/* release long term mount so mount point can be released */
 	if (!IS_ERR_OR_NULL(mnt)) {
 		real_mount(mnt)->mnt_ns = NULL;
-		synchronize_rcu(); /* yecchhh... */
+		synchronize_rcu_expedited(); /* yecchhh... */
 		mntput(mnt);
 	}
 }


Subject: Re: vchiq: Performance regression since 5.18-rc1

On 2022-05-22 01:22:00 [+0200], Stefan Wahren wrote:
> Hi,
Hi,

> while testing the staging/vc04_services/interface/vchiq_arm driver with my
> Raspberry Pi 3 B+ (multi_v7_defconfig) i noticed a huge performance
> regression since [ff042f4a9b050895a42cae893cc01fa2ca81b95c] mm:
> lru_cache_disable: replace work queue synchronization with synchronize_rcu
>
> Usually i run "vchiq_test -f 1" to see the driver is still working [1].

What about
https://lore.kernel.org/all/YmrWK%[email protected]/

Sebastian

Subject: Re: vchiq: Performance regression since 5.18-rc1

[TLDR: I'm adding this regression report to the list of tracked
regressions; all text from me you find below is based on a few template
paragraphs you might have encountered already in similar form.]

Hi, this is your Linux kernel regression tracker.

On 22.05.22 01:22, Stefan Wahren wrote:
>
> while testing the staging/vc04_services/interface/vchiq_arm driver with
> my Raspberry Pi 3 B+ (multi_v7_defconfig) i noticed a huge performance
> regression since [ff042f4a9b050895a42cae893cc01fa2ca81b95c] mm:
> lru_cache_disable: replace work queue synchronization with synchronize_rcu
>
> Usually i run "vchiq_test -f 1" to see the driver is still working [1].
>
> Before commit:
>
> real    0m1,500s
> user    0m0,068s
> sys    0m0,846s
>
> After commit:
>
> real    7m11,449s
> user    0m2,049s
> sys    0m0,023s

Thanks for the report.

To be sure the issue below doesn't fall through the cracks unnoticed, I'm
adding it to regzbot, my Linux kernel regression tracking bot:

#regzbot ^introduced ff042f4a9b050895a42cae893cc01fa2ca81b95c
#regzbot title mm: vchiq_test runs 7 minutes instead of ~ 1 second.
#regzbot ignore-activity
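
As a rough illustration of how such command lines can be picked out of a mail body, here is a minimal sketch. It is not regzbot's actual implementation; the function name parse_regzbot_commands and the splitting rules are assumptions made for this example, and the sample uses the full commit ID from the original report.

```python
def parse_regzbot_commands(body: str) -> list[tuple[str, str]]:
    """Return (command, argument) pairs for lines starting with '#regzbot'.

    Hypothetical helper: quoted lines ('> ...') and ordinary prose are
    skipped because they do not start with the '#regzbot' marker.
    """
    commands = []
    for line in body.splitlines():
        line = line.strip()
        if not line.startswith("#regzbot"):
            continue
        parts = line.split(maxsplit=2)
        if len(parts) < 2:
            continue  # a bare '#regzbot' carries no command
        argument = parts[2] if len(parts) > 2 else ""
        commands.append((parts[1], argument))
    return commands


mail = """\
> quoted text is ignored
#regzbot ^introduced ff042f4a9b050895a42cae893cc01fa2ca81b95c
#regzbot title mm: vchiq_test runs 7 minutes instead of ~ 1 second.
#regzbot ignore-activity
"""

commands = parse_regzbot_commands(mail)
for name, argument in commands:
    print(name, "|", argument)
```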

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply -- ideally with also
telling regzbot about it, as explained here:
https://linux-regtracking.leemhuis.info/tracked-regression/

Reminder for developers: When fixing the issue, add 'Link:' tags
pointing to the report (the mail this one replied to), as the kernel's
documentation calls for; the page above explains why this is important
for tracked regressions.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.

Subject: Re: vchiq: Performance regression since 5.18-rc1

On 2022-05-25 16:07:47 [+0200], Stefan Wahren wrote:
> this was the same as Paul send. I think i need more time for investigation,
> maybe there is an issue with the application.

I haven't seen Paul referring to *that* patch. He pointed to some fs/
related changes.

Sebastian

2022-05-25 16:14:36

by Marcelo Tosatti

Subject: Re: vchiq: Performance regression since 5.18-rc1

On Wed, May 25, 2022 at 04:07:47PM +0200, Stefan Wahren wrote:
> Hi Marcelo,
>
> On 25.05.22 at 15:56, Marcelo Tosatti wrote:
> > On Mon, May 23, 2022 at 09:09:07AM +0200, Sebastian Andrzej Siewior wrote:
> > > On 2022-05-22 01:22:00 [+0200], Stefan Wahren wrote:
> > > > Hi,
> > > Hi,
> > >
> > > > while testing the staging/vc04_services/interface/vchiq_arm driver with my
> > > > Raspberry Pi 3 B+ (multi_v7_defconfig) i noticed a huge performance
> > > > regression since [ff042f4a9b050895a42cae893cc01fa2ca81b95c] mm:
> > > > lru_cache_disable: replace work queue synchronization with synchronize_rcu
> > > >
> > > > Usually i run "vchiq_test -f 1" to see the driver is still working [1].
> > > What about
> > > https://lore.kernel.org/all/YmrWK%[email protected]/
> > >
> > > Sebastian
> > Stefan,
> >
> > Can you please try the patch above ?
>
> this was the same as Paul send. I think i need more time for investigation,
> maybe there is an issue with the application.

To clarify: they are not the same patches.

>
> All i noticed so far is that in good case the CPU usage is around ~ 60 % and
> higher, while in bad case the CPU is almost idle. Also the issue is not
> reproducible with arm64/defconfig.
>
> >
> >
> >
> > _______________________________________________
> > linux-arm-kernel mailing list
> > [email protected]
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
>


2022-05-25 21:45:47

by Paul E. McKenney

Subject: Re: vchiq: Performance regression since 5.18-rc1

On Wed, May 25, 2022 at 04:26:27PM +0200, Sebastian Andrzej Siewior wrote:
> On 2022-05-25 16:07:47 [+0200], Stefan Wahren wrote:
> > this was the same as Paul send. I think i need more time for investigation,
> > maybe there is an issue with the application.
>
> I haven't seen Paul referring to *that* patch. He pointed to some fs/
> related changes.

True! Both patches changed from a synchronize_rcu() to a
synchronize_rcu_expedited(), but different instances of synchronize_rcu().

Thanx, Paul

2022-05-26 00:17:30

by Marcelo Tosatti

Subject: Re: vchiq: Performance regression since 5.18-rc1

On Mon, May 23, 2022 at 09:09:07AM +0200, Sebastian Andrzej Siewior wrote:
> On 2022-05-22 01:22:00 [+0200], Stefan Wahren wrote:
> > Hi,
> Hi,
>
> > while testing the staging/vc04_services/interface/vchiq_arm driver with my
> > Raspberry Pi 3 B+ (multi_v7_defconfig) i noticed a huge performance
> > regression since [ff042f4a9b050895a42cae893cc01fa2ca81b95c] mm:
> > lru_cache_disable: replace work queue synchronization with synchronize_rcu
> >
> > Usually i run "vchiq_test -f 1" to see the driver is still working [1].
>
> What about
> https://lore.kernel.org/all/YmrWK%[email protected]/
>
> Sebastian

Stefan,

Can you please try the patch above ?



2022-05-26 23:55:42

by Stefan Wahren

Subject: Re: vchiq: Performance regression since 5.18-rc1

Hi Marcelo,

On 25.05.22 at 15:56, Marcelo Tosatti wrote:
> On Mon, May 23, 2022 at 09:09:07AM +0200, Sebastian Andrzej Siewior wrote:
>> On 2022-05-22 01:22:00 [+0200], Stefan Wahren wrote:
>>> Hi,
>> Hi,
>>
>>> while testing the staging/vc04_services/interface/vchiq_arm driver with my
>>> Raspberry Pi 3 B+ (multi_v7_defconfig) i noticed a huge performance
>>> regression since [ff042f4a9b050895a42cae893cc01fa2ca81b95c] mm:
>>> lru_cache_disable: replace work queue synchronization with synchronize_rcu
>>>
>>> Usually i run "vchiq_test -f 1" to see the driver is still working [1].
>> What about
>> https://lore.kernel.org/all/YmrWK%[email protected]/
>>
>> Sebastian
> Stefan,
>
> Can you please try the patch above ?

This was the same as the patch Paul sent. I think I need more time for
investigation; maybe there is an issue with the application.

All I have noticed so far is that in the good case the CPU usage is around
60 % and higher, while in the bad case the CPU is almost idle. Also, the
issue is not reproducible with arm64/defconfig.
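
The CPU usage observation can be cross-checked against the timings from the original report. The sketch below assumes that (user + sys) / real is a fair approximation of the test process's CPU utilization; the function name cpu_utilization is hypothetical.

```python
def cpu_utilization(real: float, user: float, sys: float) -> float:
    """Fraction of wall-clock time the process spent on a CPU."""
    return (user + sys) / real


# Values in seconds, taken from the `time` output in the original report.
good = cpu_utilization(real=1.500, user=0.068, sys=0.846)
bad = cpu_utilization(real=431.449, user=2.049, sys=0.023)
print(f"good case: {good:.1%}, bad case: {bad:.1%}")
# prints: good case: 60.9%, bad case: 0.5%
```

Both figures match the observation: roughly 60 % in the good case and an almost idle CPU in the bad case, which is what one would expect if the test were blocked waiting rather than computing.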


2022-05-30 11:28:06

by Stefan Wahren

Subject: Re: vchiq: Performance regression since 5.18-rc1

On 25.05.22 at 17:37, Marcelo Tosatti wrote:
> On Wed, May 25, 2022 at 04:07:47PM +0200, Stefan Wahren wrote:
>> Hi Marcelo,
>>
>> On 25.05.22 at 15:56, Marcelo Tosatti wrote:
>>> On Mon, May 23, 2022 at 09:09:07AM +0200, Sebastian Andrzej Siewior wrote:
>>>> On 2022-05-22 01:22:00 [+0200], Stefan Wahren wrote:
>>>>> Hi,
>>>> Hi,
>>>>
>>>>> while testing the staging/vc04_services/interface/vchiq_arm driver with my
>>>>> Raspberry Pi 3 B+ (multi_v7_defconfig) i noticed a huge performance
>>>>> regression since [ff042f4a9b050895a42cae893cc01fa2ca81b95c] mm:
>>>>> lru_cache_disable: replace work queue synchronization with synchronize_rcu
>>>>>
>>>>> Usually i run "vchiq_test -f 1" to see the driver is still working [1].
>>>> What about
>>>> https://lore.kernel.org/all/YmrWK%[email protected]/
>>>>
>>>> Sebastian
>>> Stefan,
>>>
>>> Can you please try the patch above ?
>> this was the same as Paul send. I think i need more time for investigation,
>> maybe there is an issue with the application.
> To clarify: they are not the same patches.
Thanks for pointing that out. I will test it ASAP.
>
>> All i noticed so far is that in good case the CPU usage is around ~ 60 % and
>> higher, while in bad case the CPU is almost idle. Also the issue is not
>> reproducible with arm64/defconfig.
>>

2022-05-31 08:22:16

by Stefan Wahren

Subject: Re: vchiq: Performance regression since 5.18-rc1

Hi Marcelo,
hi Sebastian,

On 25.05.22 at 15:56, Marcelo Tosatti wrote:
> On Mon, May 23, 2022 at 09:09:07AM +0200, Sebastian Andrzej Siewior wrote:
>> On 2022-05-22 01:22:00 [+0200], Stefan Wahren wrote:
>>> Hi,
>> Hi,
>>
>>> while testing the staging/vc04_services/interface/vchiq_arm driver with my
>>> Raspberry Pi 3 B+ (multi_v7_defconfig) i noticed a huge performance
>>> regression since [ff042f4a9b050895a42cae893cc01fa2ca81b95c] mm:
>>> lru_cache_disable: replace work queue synchronization with synchronize_rcu
>>>
>>> Usually i run "vchiq_test -f 1" to see the driver is still working [1].
>> What about
>> https://lore.kernel.org/all/YmrWK%[email protected]/
>>
>> Sebastian
> Stefan,
>
> Can you please try the patch above ?

This patch fixes the regression. Great!

Best regards
