2024-03-31 20:21:47

by Mikhail Gavrilov

[permalink] [raw]
Subject: regression/bisected/6.9 commit 587d67fd929ad89801bcc429675bda90d53f6592 decrease 30% of gaming performance

Hi,

I spotted that FPS was around 122 [1] in the Shadow of the Tomb Raider
benchmark at commit f6cef5f8c37f but after moving to commit
4ae3dc83b047 it decreased to 84 [2].

I bisected it and the first bad commit was 587d67fd929a.
Author: Takashi Iwai <[email protected]>
Date: Fri Mar 15 11:14:42 2024 +0100

ALSA: timer: Fix missing irq-disable at closing

The conversion to guard macro dropped the irq-disablement at closing
mistakenly, which may lead to a race. Fix it.

Fixes: beb45974dd49 ("ALSA: timer: Use guard() for locking")
Reported-by: [email protected]
Closes: http://lore.kernel.org/r/[email protected]
Message-ID: <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>

sound/core/timer.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

To make sure of this I builded kernel at commit 712e14250dd2 with
revert 587d67fd929a and the benchmarking result was returned to best
observed by me.

[1] good results - https://postimg.cc/G4NPHMyk
[2] bad results - https://postimg.cc/1n0D7sYH

I attached here my build .config and kernel log.
Is it possible to find a better approach than 587d67fd929a?

--
Best Regards,
Mike Gavrilov.


Attachments:
.config.zip (64.49 kB)
6.9.0-rc1-test-rev-587d67fd929ad89801bcc429675bda90d53f6592.zip (56.07 kB)
Download all attachments

2024-04-01 07:07:50

by Takashi Iwai

[permalink] [raw]
Subject: Re: regression/bisected/6.9 commit 587d67fd929ad89801bcc429675bda90d53f6592 decrease 30% of gaming performance

On Sun, 31 Mar 2024 22:21:20 +0200,
Mikhail Gavrilov wrote:
>
> Hi,
>
> I spotted that FPS was around 122 [1] in the Shadow of the Tomb Raider
> benchmark at commit f6cef5f8c37f but after moving to commit
> 4ae3dc83b047 it decreased to 84 [2].
>
> I bisected it and the first bad commit was 587d67fd929a.
> Author: Takashi Iwai <[email protected]>
> Date: Fri Mar 15 11:14:42 2024 +0100
>
> ALSA: timer: Fix missing irq-disable at closing
>
> The conversion to guard macro dropped the irq-disablement at closing
> mistakenly, which may lead to a race. Fix it.
>
> Fixes: beb45974dd49 ("ALSA: timer: Use guard() for locking")
> Reported-by: [email protected]
> Closes: http://lore.kernel.org/r/[email protected]
> Message-ID: <[email protected]>
> Signed-off-by: Takashi Iwai <[email protected]>
>
> sound/core/timer.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> To make sure of this I builded kernel at commit 712e14250dd2 with
> revert 587d67fd929a and the benchmarking result was returned to best
> observed by me.
>
> [1] good results - https://postimg.cc/G4NPHMyk
> [2] bad results - https://postimg.cc/1n0D7sYH
>
> I attached here my build .config and kernel log.
> Is it possible to find a better approach than 587d67fd929a?

Is it a regression against the performance on 6.8?

If so, what happens when you revert both this (587d67fd929a) and
beb45974dd49? That should make the timer code back to 6.8 code.


Takashi

2024-04-01 11:07:50

by Mikhail Gavrilov

[permalink] [raw]
Subject: Re: regression/bisected/6.9 commit 587d67fd929ad89801bcc429675bda90d53f6592 decrease 30% of gaming performance

On Mon, Apr 1, 2024 at 12:07 PM Takashi Iwai <[email protected]> wrote:
>
> Is it a regression against the performance on 6.8?
>
> If so, what happens when you revert both this (587d67fd929a) and
> beb45974dd49? That should make the timer code back to 6.8 code.
>
>
> Takashi

The commit beb45974dd49 boosted performance to 30%, but commit
587d67fd929a canceled this boost.

I want to say that reverting commit beb45974dd49 returns performance
to 83FPS [1] and this is sad.

So I ask again is it possible to find a better approach than 587d67fd929a?

[1] https://postimg.cc/xJMjkcHP

--
Best Regards,
Mike Gavrilov.


Attachments:
6.9.0-rc1-test-rev-587d67fd929a-beb45974dd49.zip (54.25 kB)

2024-04-01 11:42:02

by Takashi Iwai

[permalink] [raw]
Subject: Re: regression/bisected/6.9 commit 587d67fd929ad89801bcc429675bda90d53f6592 decrease 30% of gaming performance

On Mon, 01 Apr 2024 13:07:26 +0200,
Mikhail Gavrilov wrote:
>
> On Mon, Apr 1, 2024 at 12:07 PM Takashi Iwai <[email protected]> wrote:
> >
> > Is it a regression against the performance on 6.8?
> >
> > If so, what happens when you revert both this (587d67fd929a) and
> > beb45974dd49? That should make the timer code back to 6.8 code.
> >
> >
> > Takashi
>
> The commit beb45974dd49 boosted performance to 30%, but commit
> 587d67fd929a canceled this boost.
>
> I want to say that reverting commit beb45974dd49 returns performance
> to 83FPS [1] and this is sad.
>
> So I ask again is it possible to find a better approach than 587d67fd929a?
>
> [1] https://postimg.cc/xJMjkcHP

The only code of the commit 587d67fd929a is the close of ALSA timer
device. If it really matters, it implies that the user-space seems
trying to open/close the ALSA timer device so frequently.

Are you using ALSA dmix/dsnoop? It would explain.


Takashi

2024-04-01 13:49:55

by Mikhail Gavrilov

[permalink] [raw]
Subject: Re: regression/bisected/6.9 commit 587d67fd929ad89801bcc429675bda90d53f6592 decrease 30% of gaming performance

On Mon, Apr 1, 2024 at 4:41 PM Takashi Iwai <[email protected]> wrote:
> The only code of the commit 587d67fd929a is the close of ALSA timer
> device. If it really matters, it implies that the user-space seems
> trying to open/close the ALSA timer device so frequently.
>
> Are you using ALSA dmix/dsnoop? It would explain.

It's new words for me. I didn’t configure anything like this myself,
only if the maintainers of my distribution enable it by default.
I only know that Fedora Rawhide uses Pipewire for mixing sound.

--
Best Regards,
Mike Gavrilov.

2024-04-01 14:48:31

by Takashi Iwai

[permalink] [raw]
Subject: Re: regression/bisected/6.9 commit 587d67fd929ad89801bcc429675bda90d53f6592 decrease 30% of gaming performance

On Mon, 01 Apr 2024 15:49:33 +0200,
Mikhail Gavrilov wrote:
>
> On Mon, Apr 1, 2024 at 4:41 PM Takashi Iwai <[email protected]> wrote:
> > The only code of the commit 587d67fd929a is the close of ALSA timer
> > device. If it really matters, it implies that the user-space seems
> > trying to open/close the ALSA timer device so frequently.
> >
> > Are you using ALSA dmix/dsnoop? It would explain.
>
> It's new words for me. I didn’t configure anything like this myself,
> only if the maintainers of my distribution enable it by default.
> I only know that Fedora Rawhide uses Pipewire for mixing sound.

Then it shouldn't be dmix/dsnoop. You'd better try profiling what's
going on there. e.g. try to get perf results with and without the
commit.


Takashi

2024-04-01 16:51:16

by Mikhail Gavrilov

[permalink] [raw]
Subject: Re: regression/bisected/6.9 commit 587d67fd929ad89801bcc429675bda90d53f6592 decrease 30% of gaming performance

On Mon, Apr 1, 2024 at 7:48 PM Takashi Iwai <[email protected]> wrote:
>
>
> Then it shouldn't be dmix/dsnoop. You'd better try profiling what's
> going on there. e.g. try to get perf results with and without the
> commit.

perf diff perf-with-revert-587d67fd929a.data
perf-with-revert-587d67fd929a-beb45974dd49.data
No kallsyms or vmlinux with build-id
81ed128ee346c4f6a2595295d5b0f85ba85f770c was found
No kallsyms or vmlinux with build-id
5b7d215e519bfbc813e04bbb3850a08b8a8274c5 was found
# Event 'cycles:P'
#
# Baseline Delta Abs Shared Object
Symbol

>
# ........ ......... ..............................................
...................................................................................................................................................................>
#
44.80% +14.98% [kernel.kallsyms]
[k] 0xffffffff9f006bc0
43.34% -11.87% ShadowOfTheTombRaider
[.] 0x00000000001d9da1
1.67% -0.70% libc.so.6
[.] syscall
0.75% -0.21% libvulkan_radeon.so
[.] radv_UpdateDescriptorSetWithTemplate
0.42% -0.12% libvulkan_radeon.so
[.] radv_CmdBindPipeline
0.35% -0.12% libm.so.6
[.] __powf_fma
0.44% -0.11% libvulkan_radeon.so
[.] radv_bind_descriptor_sets
0.51% -0.11% libc.so.6
[.] __memmove_avx512_unaligned_erms
0.36% -0.11% libc.so.6
[.] __memset_avx512_unaligned_erms
0.35% -0.10% libvulkan_radeon.so
[.] radv_amdgpu_add_cs_to_bo_list
0.28% -0.07% libvulkan_radeon.so
[.] radv_emit_all_graphics_states
0.21% -0.07% libc.so.6
[.] sem_post@GLIBC_2.2.5
0.31% -0.07% libvulkan_radeon.so
[.] radv_AllocateDescriptorSets
0.20% -0.06% libvulkan_radeon.so
[.] radv_CmdDrawIndexed
0.21% -0.06% [vdso]
[.] 0x00000000000005a0
0.23% -0.06% [amdgpu]
[k] amdgpu_bo_placement_from_domain
0.19% -0.05% libvulkan_radeon.so
[.] radv_flush_constants
0.23% -0.05% [amdgpu]
[k] amdgpu_vram_mgr_compatible
0.18% -0.05% [amdgpu]
[k] amdgpu_vm_bo_update
0.12% -0.05% libc.so.6
[.] __futex_abstimed_wait_common
0.14% -0.04% libvulkan_radeon.so
[.] radv_emit_graphics_pipeline
0.16% -0.04% libvulkan_radeon.so
[.] radv_amdgpu_cs_find_buffer
0.14% -0.04% libm.so.6
[.] powf@GLIBC_2.2.5
0.18% -0.04% libvulkan_radeon.so
[.] radv_amdgpu_cs_add_buffer
0.14% -0.04% [amdgpu]
[k] amdgpu_bo_list_entry_cmp
0.15% -0.04% libvulkan_radeon.so
[.] radv_emit_all_inline_push_consts
0.08% -0.03% libc.so.6
[.] __new_sem_wait_slow64.constprop.0
0.10% -0.03% libvulkan_radeon.so
[.] radv_ResetDescriptorPool
0.11% -0.03% libvulkan_radeon.so
[.] radv_upload_graphics_shader_descriptors
+0.02% [JIT] tid 6669
[.] 0x000034d982fb9005
0.04% -0.02% [amdgpu]
[k] amdgpu_cs_ioctl
0.10% -0.02% libvulkan_radeon.so
[.] radv_bind_shader
0.00% +0.02% [crc32c_intel]
[k] crc_pcl
0.09% -0.02% libvulkan_radeon.so
[.] radv_emit_descriptor_pointers.isra.0
0.05% -0.02% libc.so.6
[.] sem_wait@@GLIBC_2.34
0.06% -0.02% libvulkan_radeon.so
[.] vk_common_CmdBindDescriptorSets
0.08% +0.02% libcef.so
[.] 0x0000000002470460
0.05% -0.02% libm.so.6
[.] __logf_fma

Here is original perf files:
[1] perf-with-revert-587d67fd929a.data -
https://mega.nz/file/EpZmXbgL#vEnxuODoB__jM6TwV6XbwC_TWMMqvT7qaWkJU22JY48
[2] perf-with-revert-587d67fd929a-beb45974dd49.data -
https://mega.nz/file/0lI31CDA#hKsv6vX7t5u-Sx1_p2E7-Y32-z5VFPlW2Y793oM-JRU

--
Best Regards,
Mike Gavrilov.

2024-04-01 19:29:21

by Mikhail Gavrilov

[permalink] [raw]
Subject: Re: regression/bisected/6.9 commit 587d67fd929ad89801bcc429675bda90d53f6592 decrease 30% of gaming performance

On Mon, 2024-04-01 at 21:50 +0500, Mikhail Gavrilov wrote:
> On Mon, Apr 1, 2024 at 7:48 PM Takashi Iwai <[email protected]> wrote:
> >
> >
> > Then it shouldn't be dmix/dsnoop.  You'd better try profiling
> > what's
> > going on there.  e.g. try to get perf results with and without the
> > commit.
>
>
> Here is original perf files:
> [1] perf-with-revert-587d67fd929a.data -
> https://mega.nz/file/EpZmXbgL#vEnxuODoB__jM6TwV6XbwC_TWMMqvT7qaWkJU22JY48
> [2] perf-with-revert-587d67fd929a-beb45974dd49.data -
> https://mega.nz/file/0lI31CDA#hKsv6vX7t5u-Sx1_p2E7-Y32-z5VFPlW2Y793oM-JRU
>


--
Best Regards,
Mike Gavrilov.


Attachments:
perf-diff.zip (124.21 kB)

2024-04-18 06:31:52

by Mikhail Gavrilov

[permalink] [raw]
Subject: Re: regression/bisected/6.9 commit 587d67fd929ad89801bcc429675bda90d53f6592 decrease 30% of gaming performance

On Tue, Apr 2, 2024 at 12:29 AM <[email protected]> wrote:
>
> On Mon, 2024-04-01 at 21:50 +0500, Mikhail Gavrilov wrote:
> > On Mon, Apr 1, 2024 at 7:48 PM Takashi Iwai <[email protected]> wrote:
> > >
> > >
> > > Then it shouldn't be dmix/dsnoop. You'd better try profiling
> > > what's
> > > going on there. e.g. try to get perf results with and without the
> > > commit.
> >
> >
> > Here is original perf files:
> > [1] perf-with-revert-587d67fd929a.data -
> > https://mega.nz/file/EpZmXbgL#vEnxuODoB__jM6TwV6XbwC_TWMMqvT7qaWkJU22JY48
> > [2] perf-with-revert-587d67fd929a-beb45974dd49.data -
> > https://mega.nz/file/0lI31CDA#hKsv6vX7t5u-Sx1_p2E7-Y32-z5VFPlW2Y793oM-JRU
> >

Excuse me. Can you have time to look in my perf files?

--
Best Regards,
Mike Gavrilov.

2024-04-24 13:22:03

by Mikhail Gavrilov

[permalink] [raw]
Subject: Re: regression/bisected/6.9 commit 587d67fd929ad89801bcc429675bda90d53f6592 decrease 30% of gaming performance

On Thu, Apr 18, 2024 at 11:22 AM Mikhail Gavrilov
<[email protected]> wrote:
>
> On Tue, Apr 2, 2024 at 12:29 AM <[email protected]> wrote:
> >
> > On Mon, 2024-04-01 at 21:50 +0500, Mikhail Gavrilov wrote:
> > > On Mon, Apr 1, 2024 at 7:48 PM Takashi Iwai <[email protected]> wrote:
> > > >
> > > >
> > > > Then it shouldn't be dmix/dsnoop. You'd better try profiling
> > > > what's
> > > > going on there. e.g. try to get perf results with and without the
> > > > commit.
> > >
> > >
> > > Here is original perf files:
> > > [1] perf-with-revert-587d67fd929a.data -
> > > https://mega.nz/file/EpZmXbgL#vEnxuODoB__jM6TwV6XbwC_TWMMqvT7qaWkJU22JY48
> > > [2] perf-with-revert-587d67fd929a-beb45974dd49.data -
> > > https://mega.nz/file/0lI31CDA#hKsv6vX7t5u-Sx1_p2E7-Y32-z5VFPlW2Y793oM-JRU
> > >
>
> Excuse me. Can you have time to look in my perf files?
>

This is a friendly reminder

--
Best Regards,
Mike Gavrilov.

2024-04-24 13:36:35

by Takashi Iwai

[permalink] [raw]
Subject: Re: regression/bisected/6.9 commit 587d67fd929ad89801bcc429675bda90d53f6592 decrease 30% of gaming performance

On Wed, 24 Apr 2024 15:21:26 +0200,
Mikhail Gavrilov wrote:
>
> On Thu, Apr 18, 2024 at 11:22 AM Mikhail Gavrilov
> <[email protected]> wrote:
> >
> > On Tue, Apr 2, 2024 at 12:29 AM <[email protected]> wrote:
> > >
> > > On Mon, 2024-04-01 at 21:50 +0500, Mikhail Gavrilov wrote:
> > > > On Mon, Apr 1, 2024 at 7:48 PM Takashi Iwai <[email protected]> wrote:
> > > > >
> > > > >
> > > > > Then it shouldn't be dmix/dsnoop. You'd better try profiling
> > > > > what's
> > > > > going on there. e.g. try to get perf results with and without the
> > > > > commit.
> > > >
> > > >
> > > > Here is original perf files:
> > > > [1] perf-with-revert-587d67fd929a.data -
> > > > https://mega.nz/file/EpZmXbgL#vEnxuODoB__jM6TwV6XbwC_TWMMqvT7qaWkJU22JY48
> > > > [2] perf-with-revert-587d67fd929a-beb45974dd49.data -
> > > > https://mega.nz/file/0lI31CDA#hKsv6vX7t5u-Sx1_p2E7-Y32-z5VFPlW2Y793oM-JRU
> > > >
> >
> > Excuse me. Can you have time to look in my perf files?
> >
>
> This is a friendly reminder

Honestly speaking, it's your role to analyze what's going on.
Such a performance issue is pretty much dependent on the installed
system, so giving a perf data to me won't help much, unfortunately.


Takashi

2024-04-24 13:52:06

by Takashi Iwai

[permalink] [raw]
Subject: Re: regression/bisected/6.9 commit 587d67fd929ad89801bcc429675bda90d53f6592 decrease 30% of gaming performance

On Wed, 24 Apr 2024 15:33:10 +0200,
Takashi Iwai wrote:
>
> On Wed, 24 Apr 2024 15:21:26 +0200,
> Mikhail Gavrilov wrote:
> >
> > On Thu, Apr 18, 2024 at 11:22 AM Mikhail Gavrilov
> > <[email protected]> wrote:
> > >
> > > On Tue, Apr 2, 2024 at 12:29 AM <[email protected]> wrote:
> > > >
> > > > On Mon, 2024-04-01 at 21:50 +0500, Mikhail Gavrilov wrote:
> > > > > On Mon, Apr 1, 2024 at 7:48 PM Takashi Iwai <[email protected]> wrote:
> > > > > >
> > > > > >
> > > > > > Then it shouldn't be dmix/dsnoop. You'd better try profiling
> > > > > > what's
> > > > > > going on there. e.g. try to get perf results with and without the
> > > > > > commit.
> > > > >
> > > > >
> > > > > Here is original perf files:
> > > > > [1] perf-with-revert-587d67fd929a.data -
> > > > > https://mega.nz/file/EpZmXbgL#vEnxuODoB__jM6TwV6XbwC_TWMMqvT7qaWkJU22JY48
> > > > > [2] perf-with-revert-587d67fd929a-beb45974dd49.data -
> > > > > https://mega.nz/file/0lI31CDA#hKsv6vX7t5u-Sx1_p2E7-Y32-z5VFPlW2Y793oM-JRU
> > > > >
> > >
> > > Excuse me. Can you have time to look in my perf files?
> > >
> >
> > This is a friendly reminder
>
> Honestly speaking, it's your role to analyze what's going on.
> Such a performance issue is pretty much dependent on the installed
> system, so giving a perf data to me won't help much, unfortunately.

That said, maybe the first thing you can try would be to check who is
actually calling the corresponding function
(snd_timer_close_locked()). Put a debug print or a tracing hook to
watch out for figuring out. If the commit was really relevant, it
must be called very frequently and concurrently, and I don't know
really who does it except for dmix/dsnoop.


Takashi

2024-04-29 23:00:28

by Mikhail Gavrilov

[permalink] [raw]
Subject: Re: regression/bisected/6.9 commit 587d67fd929ad89801bcc429675bda90d53f6592 decrease 30% of gaming performance

On Wed, Apr 24, 2024 at 6:42 PM Takashi Iwai <[email protected]> wrote:
>
> That said, maybe the first thing you can try would be to check who is
> actually calling the corresponding function
> (snd_timer_close_locked()). Put a debug print or a tracing hook to
> watch out for figuring out. If the commit was really relevant, it
> must be called very frequently and concurrently, and I don't know
> really who does it except for dmix/dsnoop.
>

It's madness.
Yes I added printk to snd_timer_close_locked and saw that function
invoked only 3 times when the system booted.
And never invoked during benchmark.

> git diff
diff --git a/Makefile b/Makefile
index 40fb2ca6fe4c..3ecff79a23b8 100644
--- a/Makefile
+++ b/Makefile
@@ -2,7 +2,7 @@
VERSION = 6
PATCHLEVEL = 9
SUBLEVEL = 0
-EXTRAVERSION = -rc6
+EXTRAVERSION = -rc6-test-build
NAME = Hurr durr I'ma ninja sloth

# *DOCUMENTATION*
diff --git a/sound/core/timer.c b/sound/core/timer.c
index 4d2ee99c12a3..59d8e4698b0b 100644
--- a/sound/core/timer.c
+++ b/sound/core/timer.c
@@ -407,7 +407,7 @@ static void snd_timer_close_locked(struct
snd_timer_instance *timeri,
struct device **card_devp_to_put)
{
struct snd_timer *timer = timeri->timer;
-
+ printk(KERN_ALERT "DEBUG: Passed %s %d \n",__FUNCTION__,__LINE__);
if (timer) {
guard(spinlock_irq)(&timer->lock);
timeri->flags |= SNDRV_TIMER_IFLG_DEAD;

[Tue Apr 30 03:33:25 2024] igc 0000:0a:00.0 eno1: NIC Link is Up 1000
Mbps Full Duplex, Flow Control: RX/TX
[Tue Apr 30 03:33:33 2024] rfkill: input handler disabled
[Tue Apr 30 03:33:36 2024] DEBUG: Passed snd_timer_close_locked 410
[Tue Apr 30 03:33:36 2024] Bluetooth: RFCOMM TTY layer initialized
[Tue Apr 30 03:33:36 2024] Bluetooth: RFCOMM socket layer initialized
[Tue Apr 30 03:33:36 2024] Bluetooth: RFCOMM ver 1.11
[Tue Apr 30 03:33:51 2024] systemd-journald[949]:
/var/log/journal/1b6a399fb0874de8b095a739fe2ff323/user-1000.journal:
Journal file uses a different sequence number ID, rotating.
[Tue Apr 30 03:33:52 2024] rfkill: input handler enabled
[Tue Apr 30 03:33:55 2024] DEBUG: Passed snd_timer_close_locked 410
[Tue Apr 30 03:33:55 2024] rfkill: input handler disabled
[Tue Apr 30 03:33:57 2024] input: solaar-keyboard as
/devices/virtual/input/input18
[Tue Apr 30 03:34:08 2024] DEBUG: Passed snd_timer_close_locked 410
[Tue Apr 30 03:34:11 2024] input: Noble FoKus Mystique (AVRCP) as
/devices/virtual/input/input19
[Tue Apr 30 03:34:23 2024] show_signal: 8 callbacks suppressed
[Tue Apr 30 03:34:23 2024] traps: gldriverquery[4464] general
protection fault ip:7ff83958c76f sp:7ffc464e0e00 error:0 in
libLLVM.so.18.1[7ff83920c000+3afd000]
[Tue Apr 30 03:34:25 2024] workqueue: gc_worker [nf_conntrack] hogged
CPU for >10000us 5 times, consider switching to WQ_UNBOUND
[Tue Apr 30 03:34:26 2024] workqueue: gc_worker [nf_conntrack] hogged
CPU for >10000us 7 times, consider switching to WQ_UNBOUND
[Tue Apr 30 03:35:27 2024] workqueue: gc_worker [nf_conntrack] hogged
CPU for >10000us 11 times, consider switching to WQ_UNBOUND
[Tue Apr 30 03:36:28 2024] workqueue: gc_worker [nf_conntrack] hogged
CPU for >10000us 19 times, consider switching to WQ_UNBOUND
[Tue Apr 30 03:39:33 2024] workqueue: gc_worker [nf_conntrack] hogged
CPU for >10000us 35 times, consider switching to WQ_UNBOUND

I saw a similar picture with ftrace.
I had no experience with ftrace, so I am laying out everything here
for you to see if I am doing something wrong or not.

# echo "snd_timer_close_locked" > /sys/kernel/tracing/set_ftrace_filter

# echo "function" > /sys/kernel/tracing/current_tracer

# echo 1 > /sys/kernel/tracing/tracing_on

/* Benchmark the game here */

# echo 0 > /sys/kernel/tracing/tracing_on

# cat /sys/kernel/tracing/trace
# tracer: function
#
# entries-in-buffer/entries-written: 0/0 #P:32
#
# _-----=> irqs-off/BH-disabled
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / _-=> migrate-disable
# |||| / delay
# TASK-PID CPU# ||||| TIMESTAMP FUNCTION
# | | | ||||| | |

# cat /sys/kernel/tracing/set_ftrace_filter
snd_timer_close_locked [snd_timer]

Now I am tormented by the question: how can reverting code that is not
invoked directly during a benchmark affect the benchmark result so
much?
I'm afraid that I won't be able to figure this out on my own.

--
Best Regards,
Mike Gavrilov.

2024-04-30 05:51:41

by Takashi Iwai

[permalink] [raw]
Subject: Re: regression/bisected/6.9 commit 587d67fd929ad89801bcc429675bda90d53f6592 decrease 30% of gaming performance

On Tue, 30 Apr 2024 01:00:07 +0200,
Mikhail Gavrilov wrote:
>
> On Wed, Apr 24, 2024 at 6:42 PM Takashi Iwai <[email protected]> wrote:
> >
> > That said, maybe the first thing you can try would be to check who is
> > actually calling the corresponding function
> > (snd_timer_close_locked()). Put a debug print or a tracing hook to
> > watch out for figuring out. If the commit was really relevant, it
> > must be called very frequently and concurrently, and I don't know
> > really who does it except for dmix/dsnoop.
> >
>
> It's madness.
> Yes I added printk to snd_timer_close_locked and saw that function
> invoked only 3 times when the system booted.
> And never invoked during benchmark.
>
> > git diff
> diff --git a/Makefile b/Makefile
> index 40fb2ca6fe4c..3ecff79a23b8 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -2,7 +2,7 @@
> VERSION = 6
> PATCHLEVEL = 9
> SUBLEVEL = 0
> -EXTRAVERSION = -rc6
> +EXTRAVERSION = -rc6-test-build
> NAME = Hurr durr I'ma ninja sloth
>
> # *DOCUMENTATION*
> diff --git a/sound/core/timer.c b/sound/core/timer.c
> index 4d2ee99c12a3..59d8e4698b0b 100644
> --- a/sound/core/timer.c
> +++ b/sound/core/timer.c
> @@ -407,7 +407,7 @@ static void snd_timer_close_locked(struct
> snd_timer_instance *timeri,
> struct device **card_devp_to_put)
> {
> struct snd_timer *timer = timeri->timer;
> -
> + printk(KERN_ALERT "DEBUG: Passed %s %d \n",__FUNCTION__,__LINE__);
> if (timer) {
> guard(spinlock_irq)(&timer->lock);
> timeri->flags |= SNDRV_TIMER_IFLG_DEAD;
>
> [Tue Apr 30 03:33:25 2024] igc 0000:0a:00.0 eno1: NIC Link is Up 1000
> Mbps Full Duplex, Flow Control: RX/TX
> [Tue Apr 30 03:33:33 2024] rfkill: input handler disabled
> [Tue Apr 30 03:33:36 2024] DEBUG: Passed snd_timer_close_locked 410
> [Tue Apr 30 03:33:36 2024] Bluetooth: RFCOMM TTY layer initialized
> [Tue Apr 30 03:33:36 2024] Bluetooth: RFCOMM socket layer initialized
> [Tue Apr 30 03:33:36 2024] Bluetooth: RFCOMM ver 1.11
> [Tue Apr 30 03:33:51 2024] systemd-journald[949]:
> /var/log/journal/1b6a399fb0874de8b095a739fe2ff323/user-1000.journal:
> Journal file uses a different sequence number ID, rotating.
> [Tue Apr 30 03:33:52 2024] rfkill: input handler enabled
> [Tue Apr 30 03:33:55 2024] DEBUG: Passed snd_timer_close_locked 410
> [Tue Apr 30 03:33:55 2024] rfkill: input handler disabled
> [Tue Apr 30 03:33:57 2024] input: solaar-keyboard as
> /devices/virtual/input/input18
> [Tue Apr 30 03:34:08 2024] DEBUG: Passed snd_timer_close_locked 410
> [Tue Apr 30 03:34:11 2024] input: Noble FoKus Mystique (AVRCP) as
> /devices/virtual/input/input19
> [Tue Apr 30 03:34:23 2024] show_signal: 8 callbacks suppressed
> [Tue Apr 30 03:34:23 2024] traps: gldriverquery[4464] general
> protection fault ip:7ff83958c76f sp:7ffc464e0e00 error:0 in
> libLLVM.so.18.1[7ff83920c000+3afd000]
> [Tue Apr 30 03:34:25 2024] workqueue: gc_worker [nf_conntrack] hogged
> CPU for >10000us 5 times, consider switching to WQ_UNBOUND
> [Tue Apr 30 03:34:26 2024] workqueue: gc_worker [nf_conntrack] hogged
> CPU for >10000us 7 times, consider switching to WQ_UNBOUND
> [Tue Apr 30 03:35:27 2024] workqueue: gc_worker [nf_conntrack] hogged
> CPU for >10000us 11 times, consider switching to WQ_UNBOUND
> [Tue Apr 30 03:36:28 2024] workqueue: gc_worker [nf_conntrack] hogged
> CPU for >10000us 19 times, consider switching to WQ_UNBOUND
> [Tue Apr 30 03:39:33 2024] workqueue: gc_worker [nf_conntrack] hogged
> CPU for >10000us 35 times, consider switching to WQ_UNBOUND
>
> I saw a similar picture with ftrace.
> I had no experience with ftrace, so I am laying out everything here
> for you to see if I am doing something wrong or not.
>
> # echo "snd_timer_close_locked" > /sys/kernel/tracing/set_ftrace_filter
>
> # echo "function" > /sys/kernel/tracing/current_tracer
>
> # echo 1 > /sys/kernel/tracing/tracing_on
>
> /* Benchmark the game here */
>
> # echo 0 > /sys/kernel/tracing/tracing_on
>
> # cat /sys/kernel/tracing/trace
> # tracer: function
> #
> # entries-in-buffer/entries-written: 0/0 #P:32
> #
> # _-----=> irqs-off/BH-disabled
> # / _----=> need-resched
> # | / _---=> hardirq/softirq
> # || / _--=> preempt-depth
> # ||| / _-=> migrate-disable
> # |||| / delay
> # TASK-PID CPU# ||||| TIMESTAMP FUNCTION
> # | | | ||||| | |
>
> # cat /sys/kernel/tracing/set_ftrace_filter
> snd_timer_close_locked [snd_timer]
>
> Now I am tormented by the question: how can reverting code that is not
> invoked directly during a benchmark affect the benchmark result so
> much?

It implies that it's not about the revert that influences on the
result. It might be rather the fact that you built the kernel by
yourself. Try to build the kernel again with a debug print but
without the revert of the spinlock. Does it still show the same
improvement? If so, try again without the debug print and retest.


Takashi

2024-04-30 10:56:26

by Mikhail Gavrilov

[permalink] [raw]
Subject: Re: regression/bisected/6.9 commit 587d67fd929ad89801bcc429675bda90d53f6592 decrease 30% of gaming performance

On Tue, Apr 30, 2024 at 10:50 AM Takashi Iwai <[email protected]> wrote:
>
> It implies that it's not about the revert that influences on the
> result. It might be rather the fact that you built the kernel by
> yourself. Try to build the kernel again with a debug print but
> without the revert of the spinlock.

It was built without reverting the spinlock.
With revert in the kernel log appears "WARNING: possible irq lock
inversion dependency detected"
I suppose commit 587d67fd929a fixes just this.
About calls of snd_timer_close_locked, I again see only 3 invokes when
the system booted:
[Tue Apr 30 15:09:40 2024] igc 0000:0a:00.0 eno1: NIC Link is Up 1000
Mbps Full Duplex, Flow Control: RX/TX
[Tue Apr 30 15:09:42 2024] input: Denon PerL Pro 772 (AVRCP) as
/devices/virtual/input/input18
[Tue Apr 30 15:09:45 2024] rfkill: input handler disabled
[Tue Apr 30 15:09:49 2024] Bluetooth: RFCOMM TTY layer initialized
[Tue Apr 30 15:09:49 2024] Bluetooth: RFCOMM socket layer initialized
[Tue Apr 30 15:09:49 2024] Bluetooth: RFCOMM ver 1.11
[Tue Apr 30 15:09:49 2024] DEBUG: Passed snd_timer_close_locked 410
[Tue Apr 30 15:09:51 2024] systemd-journald[947]:
/var/log/journal/1b6a399fb0874de8b095a739fe2ff323/user-1000.journal:
Journal file uses a different sequence number ID, rotating.
[Tue Apr 30 15:09:52 2024] rfkill: input handler enabled
[Tue Apr 30 15:09:55 2024] DEBUG: Passed snd_timer_close_locked 410
[Tue Apr 30 15:09:55 2024] rfkill: input handler disabled
[Tue Apr 30 15:09:55 2024] input: Denon PerL Pro 772 (AVRCP) as
/devices/virtual/input/input19
[Tue Apr 30 15:09:57 2024] input: solaar-keyboard as
/devices/virtual/input/input20
[Tue Apr 30 15:10:08 2024] DEBUG: Passed snd_timer_close_locked 410
[Tue Apr 30 15:10:23 2024] show_signal_msg: 8 callbacks suppressed
[Tue Apr 30 15:10:23 2024] gldriverquery[4423]: segfault at 0 ip
0000000000000000 sp 00007fff2ee18c38 error 14 in
gldriverquery[55c135c00000+14000] likely on CPU 11 (core 11, socket 0)
[Tue Apr 30 15:10:23 2024] Code: Unable to access opcode bytes at
0xffffffffffffffd6.
[Tue Apr 30 15:10:40 2024] workqueue: gc_worker [nf_conntrack] hogged
CPU for >10000us 7 times, consider switching to WQ_UNBOUND
[Tue Apr 30 15:10:40 2024] workqueue: gc_worker [nf_conntrack] hogged
CPU for >10000us 11 times, consider switching to WQ_UNBOUND

And no invokes during benchmark.
root@primary-ws ~# cat /sys/kernel/tracing/trace
# tracer: function
#
# entries-in-buffer/entries-written: 0/0 #P:32
#
# _-----=> irqs-off/BH-disabled
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / _-=> migrate-disable
# |||| / delay
# TASK-PID CPU# ||||| TIMESTAMP FUNCTION
# | | | ||||| | |

I attached here two kernel logs both with printk debug. One is clean
6.9.0-rc6 the other 6.9.0-rc6 with revert 587d67fd929a.
I didn't see any difference between kernel logs.
I only see that without 587d67fd929a at some point a possible irq lock
inversion dependency is detected and the benchmark improves results up
to 30%.

--
Best Regards,
Mike Gavrilov.


Attachments:
dmesg-test-build.zip (49.13 kB)
dmesg-test-build-revert-587d67fd929a.zip (51.91 kB)
Download all attachments

2024-04-30 12:22:17

by Takashi Iwai

[permalink] [raw]
Subject: Re: regression/bisected/6.9 commit 587d67fd929ad89801bcc429675bda90d53f6592 decrease 30% of gaming performance

On Tue, 30 Apr 2024 12:54:36 +0200,
Mikhail Gavrilov wrote:
>
> On Tue, Apr 30, 2024 at 10:50 AM Takashi Iwai <[email protected]> wrote:
> >
> > It implies that it's not about the revert that influences on the
> > result. It might be rather the fact that you built the kernel by
> > yourself. Try to build the kernel again with a debug print but
> > without the revert of the spinlock.
>
> It was built without reverting the spinlock.
> With revert in the kernel log appears "WARNING: possible irq lock
> inversion dependency detected"

Where and how...? That important piece is missing in you info...
And, it's likely a false-positive detection by lockdep.

When you disable lockdep, does it still show any performance down?


Takashi

> I suppose commit 587d67fd929a fixes just this.
> About calls of snd_timer_close_locked, I again see only 3 invokes when
> the system booted:
> [Tue Apr 30 15:09:40 2024] igc 0000:0a:00.0 eno1: NIC Link is Up 1000
> Mbps Full Duplex, Flow Control: RX/TX
> [Tue Apr 30 15:09:42 2024] input: Denon PerL Pro 772 (AVRCP) as
> /devices/virtual/input/input18
> [Tue Apr 30 15:09:45 2024] rfkill: input handler disabled
> [Tue Apr 30 15:09:49 2024] Bluetooth: RFCOMM TTY layer initialized
> [Tue Apr 30 15:09:49 2024] Bluetooth: RFCOMM socket layer initialized
> [Tue Apr 30 15:09:49 2024] Bluetooth: RFCOMM ver 1.11
> [Tue Apr 30 15:09:49 2024] DEBUG: Passed snd_timer_close_locked 410
> [Tue Apr 30 15:09:51 2024] systemd-journald[947]:
> /var/log/journal/1b6a399fb0874de8b095a739fe2ff323/user-1000.journal:
> Journal file uses a different sequence number ID, rotating.
> [Tue Apr 30 15:09:52 2024] rfkill: input handler enabled
> [Tue Apr 30 15:09:55 2024] DEBUG: Passed snd_timer_close_locked 410
> [Tue Apr 30 15:09:55 2024] rfkill: input handler disabled
> [Tue Apr 30 15:09:55 2024] input: Denon PerL Pro 772 (AVRCP) as
> /devices/virtual/input/input19
> [Tue Apr 30 15:09:57 2024] input: solaar-keyboard as
> /devices/virtual/input/input20
> [Tue Apr 30 15:10:08 2024] DEBUG: Passed snd_timer_close_locked 410
> [Tue Apr 30 15:10:23 2024] show_signal_msg: 8 callbacks suppressed
> [Tue Apr 30 15:10:23 2024] gldriverquery[4423]: segfault at 0 ip
> 0000000000000000 sp 00007fff2ee18c38 error 14 in
> gldriverquery[55c135c00000+14000] likely on CPU 11 (core 11, socket 0)
> [Tue Apr 30 15:10:23 2024] Code: Unable to access opcode bytes at
> 0xffffffffffffffd6.
> [Tue Apr 30 15:10:40 2024] workqueue: gc_worker [nf_conntrack] hogged
> CPU for >10000us 7 times, consider switching to WQ_UNBOUND
> [Tue Apr 30 15:10:40 2024] workqueue: gc_worker [nf_conntrack] hogged
> CPU for >10000us 11 times, consider switching to WQ_UNBOUND
>
> And no invokes during benchmark.
> root@primary-ws ~# cat /sys/kernel/tracing/trace
> # tracer: function
> #
> # entries-in-buffer/entries-written: 0/0 #P:32
> #
> # _-----=> irqs-off/BH-disabled
> # / _----=> need-resched
> # | / _---=> hardirq/softirq
> # || / _--=> preempt-depth
> # ||| / _-=> migrate-disable
> # |||| / delay
> # TASK-PID CPU# ||||| TIMESTAMP FUNCTION
> # | | | ||||| | |
>
> I attached here two kernel logs both with printk debug. One is clean
> 6.9.0-rc6 the other 6.9.0-rc6 with revert 587d67fd929a.
> I didn't see any difference between kernel logs.
> I only see that without 587d67fd929a at some point a possible irq lock
> inversion dependency is detected and the benchmark improves results up
> to 30%.
>
> --
> Best Regards,
> Mike Gavrilov.
>
>

2024-04-30 14:45:44

by Mikhail Gavrilov

[permalink] [raw]
Subject: Re: regression/bisected/6.9 commit 587d67fd929ad89801bcc429675bda90d53f6592 decrease 30% of gaming performance

On Tue, Apr 30, 2024 at 5:12 PM Takashi Iwai <[email protected]> wrote:
> Where and how...? That important piece is missing in you info...
> And, it's likely a false-positive detection by lockdep.
>
> When you disable lockdep, does it still show any performance down?

Of course without "lockdep" performance is even better than the commit
587d67fd929a just reverted.
- mainline kernel without changes with lockdep - 13244 [1]
- mainline kernel with reverted 587d67fd929a with lockdep - 19218 [2]
- mainline kernel without changes without lockdep - 22440 [3]

I think I get what you're saying: after a false-positive detection,
lockdep is partially disabled. Yes?
How do I understand that lockdep has partially turned off?
I would expect to see such a message in the kernel log " INFO: lockdep
is turned off."
But in my case this message didn't appear.
In any case, sorry for wasting your time.

[1] https://postimg.cc/G4NPHMyk
[2] https://postimg.cc/1n0D7sYH
[3] https://postimg.cc/nMYHWRcF

--
Best Regards,
Mike Gavrilov.