LinuxLists.cc - Re: This is the fourth time I've tried to find what led to the regression of outgoing network speed and each time I find the merge commit 8c94ccc7cd691472461448f98e2372c75849406c

2024-02-26 10:16:08

Subject: Re: This is the fourth time I've tried to find what led to the regression of outgoing network speed and each time I find the merge commit 8c94ccc7cd691472461448f98e2372c75849406c

On 26.02.24 10:24, Mathias Nyman wrote:
> On 26.2.2024 7.45, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 21.02.24 14:44, Mathias Nyman wrote:
>>> On 21.2.2024 1.43, Randy Dunlap wrote:
>>>> On 2/20/24 15:41, Randy Dunlap wrote:
>>>>> [+ tglx]
>>>>> On 2/20/24 15:19, Mikhail Gavrilov wrote:
>>>>>> On Mon, Feb 19, 2024 at 2:41 PM Mikhail Gavrilov
>>>>>> <[email protected]> wrote:
>>>>>> I spotted network performance regression and it turned out, this was
>>>>>> due to the network card getting other interrupt. It is a side effect
>>>>>> of commit 57e153dfd0e7a080373fe5853c5609443d97fa5a.
>>>>> That's a merge commit (AFAIK, maybe not so much). The commit in
>>>>> mainline is:
>>>>>
>>>>> commit f977f4c9301c
>>>>> Author: Niklas Neronin <[email protected]>
>>>>> Date: Fri Dec 1 17:06:40 2023 +0200
>>>>>
>>>>> xhci: add handler for only one interrupt line
>>>>>
>>>>>> Installing irqbalance daemon did not help. Maybe someone experienced
>>>>>> such a problem?
>>>>>
>>>>> Thomas, would you look at this, please?
>>>>>
>>>>> A network device and xhci (USB) driver are now sharing interrupts.
>>>>> This causes a large performance decrease for the networking device.
>>>
>>> Short recap:
>>
>> Thx for that. As the 6.8 release is merely two or three weeks away while
>> a fix is nowhere near in sight yet (afaics!) I start to wonder if we
>> should consider a revert here and try reapplying the culprit in a later
>> cycle when this problem is fixed.

Thx for the reply.

> I don't think reverting this series is a solution.
>
> This isn't really about those usb xhci patches.
> This is about which interrupt gets assigned to which CPU.

I know, but from my understanding of Linus' expectations wrt handling
regressions it does not matter much if a bug existed earlier or
somewhere else: what counts is the commit that exposed the problem.

But I might be wrong here. Anyway, not CCing Linus for this; but I'll
likely point him to this direction on Sunday in my next weekly report,
unless some fix comes into sight.

> Mikhail got unlucky when the network adapter interrupts on that system was
> assigned to CPU0, clearly a more "clogged" CPU, thus causing a drop in max
> bandwidth.

But maybe others will be just as "unlucky". Or is there anything to
believe otherwise? Maybe some aspect of the .config or local setup that
is most likely unique to Mikhail's setup?

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

2024-02-26 10:56:00

by Mathias Nyman

On 26.2.2024 11.51, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 26.02.24 10:24, Mathias Nyman wrote:
>> On 26.2.2024 7.45, Linux regression tracking (Thorsten Leemhuis) wrote:
>>> On 21.02.24 14:44, Mathias Nyman wrote:
>>>> On 21.2.2024 1.43, Randy Dunlap wrote:
>>>>> On 2/20/24 15:41, Randy Dunlap wrote:
>>>>>> [+ tglx]
>>>>>> On 2/20/24 15:19, Mikhail Gavrilov wrote:
>>>>>>> On Mon, Feb 19, 2024 at 2:41 PM Mikhail Gavrilov
>>>>>>> <[email protected]> wrote:
>>>>>>> I spotted network performance regression and it turned out, this was
>>>>>>> due to the network card getting other interrupt. It is a side effect
>>>>>>> of commit 57e153dfd0e7a080373fe5853c5609443d97fa5a.
>>>>>> That's a merge commit (AFAIK, maybe not so much). The commit in
>>>>>> mainline is:
>>>>>>
>>>>>> commit f977f4c9301c
>>>>>> Author: Niklas Neronin <[email protected]>
>>>>>> Date: Fri Dec 1 17:06:40 2023 +0200
>>>>>>
>>>>>> xhci: add handler for only one interrupt line
>>>>>>
>>>>>>> Installing irqbalance daemon did not help. Maybe someone experienced
>>>>>>> such a problem?
>>>>>>
>>>>>> Thomas, would you look at this, please?
>>>>>>
>>>>>> A network device and xhci (USB) driver are now sharing interrupts.
>>>>>> This causes a large performance decrease for the networking device.
>>>>
>>>> Short recap:
>>>
>>> Thx for that. As the 6.8 release is merely two or three weeks away while
>>> a fix is nowhere near in sight yet (afaics!) I start to wonder if we
>>> should consider a revert here and try reapplying the culprit in a later
>>> cycle when this problem is fixed.
>
> Thx for the reply.
>
>> I don't think reverting this series is a solution.
>>
>> This isn't really about those usb xhci patches.
>> This is about which interrupt gets assigned to which CPU.
>
> I know, but from my understanding of Linus expectations wrt to handling
> regressions it does not matter much if a bug existed earlier or
> somewhere else: what counts is the commit that exposed the problem.
>
> But I might be wrong here. Anyway, not CCing Linus for this; but I'll
> likely point him to this direction on Sunday in my next weekly report,
> unless some fix comes into sight.
>
>> Mikhail got unlucky when the network adapter interrupts on that system was
>> assigned to CPU0, clearly a more "clogged" CPU, thus causing a drop in max
>> bandwidth.
>
> But maybe others will be just as "unlucky". Or is there anything to
> believe otherwise? Maybe some aspect of the .config or local setup that
> is most likely unique to Mikhail's setup?

I believe this is a zero-sum case.

Others got equally lucky due to this change.
Their devices end up interrupting less clogged CPUs and see a similar
performance increase.

Thanks
Mathias

2024-02-26 18:39:47

by Thomas Gleixner

On Mon, Feb 26 2024 at 12:54, Mathias Nyman wrote:
> On 26.2.2024 11.51, Linux regression tracking (Thorsten Leemhuis) wrote:
>>> I don't think reverting this series is a solution.
>>>
>>> This isn't really about those usb xhci patches.
>>> This is about which interrupt gets assigned to which CPU.
>>
>> I know, but from my understanding of Linus expectations wrt to handling
>> regressions it does not matter much if a bug existed earlier or
>> somewhere else: what counts is the commit that exposed the problem.
>>
>> But I might be wrong here. Anyway, not CCing Linus for this; but I'll
>> likely point him to this direction on Sunday in my next weekly report,
>> unless some fix comes into sight.
>>
>>> Mikhail got unlucky when the network adapter interrupts on that system was
>>> assigned to CPU0, clearly a more "clogged" CPU, thus causing a drop in max
>>> bandwidth.
>>
>> But maybe others will be just as "unlucky". Or is there anything to
>> believe otherwise? Maybe some aspect of the .config or local setup that
>> is most likely unique to Mikhail's setup?
>
> I believe this is a zero-sum case.
>
> Others got equally lucky due to this change.
> Their devices end up interrupting less clogged CPUs and see a similar
> performance increase.

Reverting this does not make any sense.

The kernel assigns the initial interrupt affinities to the CPUs so that
the number of interrupts is halfways balanced. That spreading algorithm
is completely agnostic of the actual usage of the interrupts. Where
e.g. the network interrupt ends up depends on the probe/enumeration
order of devices. Add another PCI-E card into the machine and it will
again look different.
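
To make the probe-order point concrete, here is a toy model in bash. It is not the kernel's actual spreading code: the function name spread_irqs, the device names, and the rule "the CPU with the fewest interrupts wins, lowest CPU number on ties" are all assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Toy model only, NOT the kernel's real spreading algorithm.
# Each newly probed device's interrupt goes to the CPU that currently
# holds the fewest interrupts; ties go to the lowest CPU number.
spread_irqs() {
    local ncpus=$1; shift
    local -a load=()
    local cpu best dev
    for ((cpu = 0; cpu < ncpus; cpu++)); do
        load[cpu]=0
    done
    for dev in "$@"; do                 # "$@" is the (hypothetical) probe order
        best=0
        for ((cpu = 1; cpu < ncpus; cpu++)); do
            if (( load[cpu] < load[best] )); then
                best=$cpu
            fi
        done
        load[best]=$(( load[best] + 1 ))
        echo "$dev -> CPU$best"
    done
}

spread_irqs 2 nvme xhci eth
echo "--- same machine, one more card probed first ---"
spread_irqs 2 newcard nvme xhci eth
```

In this model the network device moves from CPU0 to CPU1 purely because one more device was probed ahead of it; nothing about its own load entered the decision.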

There is nothing the kernel can do about it and earlier attempts to do
interrupt frequency based balancing in the kernel ended up nowhere
simply because the kernel does not have enough information about the
overall requirements. That's why the kernel leaves the affinity
configuration for user space, e.g. irqbalanced, except for true
multi-queue scenarios like NVME where the kernel binds queues and their
interrupts to specific CPUs or groups of CPUs.

Why ending up on CPU0 has this particular effect on Mikhail's machine is
unclear as we don't have any information about the overall workload,
other interrupt sources on CPU0 and their frequency. That'd need to be
investigated with instrumentation and might unearth some completely
different underlying reason causing this behavior.
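
A first look at what else lands on CPU0 can come from /proc/interrupts, before any tracing. The sketch below is a hypothetical helper (irqs_on_cpu is not an existing tool) that lists the interrupts with a non-zero count on one CPU, busiest first. It assumes the CPU columns in the header are contiguous and start at CPU0, which does not hold when some CPUs are offline.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "count irq: description" for every interrupt
# that has fired on the given CPU, sorted by count (highest first).
# Usage: irqs_on_cpu CPU [FILE]   (FILE defaults to /proc/interrupts)
irqs_on_cpu() {
    local cpu=$1 file=${2:-/proc/interrupts}
    awk -v cpu="$cpu" '
        NR == 1 { ncpus = NF; next }        # header row: one CPUn column per CPU
        NF > ncpus && $(cpu + 2) > 0 {      # skip short rows such as ERR: or MIS:
            desc = ""
            for (i = ncpus + 2; i <= NF; i++) desc = desc " " $i
            printf "%s %s%s\n", $(cpu + 2), $1, desc
        }
    ' "$file" | sort -rn
}
```

Calling `irqs_on_cpu 0` twice a few seconds apart while iperf3 is running and comparing the counts gives a rough per-source interrupt rate on CPU0.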

So I don't think this is a regression in the true sense of
regressions. It's an unfortunate coincidence and reverting the
identified commits would just paper over the real problem, if there is
actually one single source of trouble which causes the performance drop
only on CPU0. The commits are definitely _not_ the root cause, they
happen to unearth some other issue, which might be as mundane as
e.g. that the NVME interrupt on CPU0 is competing with the network
interrupt. So don't shoot the messenger.

Thanks,

tglx

2024-02-27 17:34:22

by Mikhail Gavrilov

On Mon, 2024-02-26 at 19:09 +0100, Thomas Gleixner wrote:
> we don't have any information about the overall workload,

During measurements nothing was running except iperf3

> other interrupt sources on CPU0 and their frequency. That'd need to
> be investigated with instrumentation and might unearth some
> completely different underlying reason causing this behavior.

I made a simple bash script to benchmark enp14s0 on each CPU core.

#!/usr/bin/env bash
# Pin IRQ 84 to each CPU in turn and run a 5-second iperf3 test (run as root).
for i in {0..31}
do
    # smp_affinity takes a hex CPU bitmask: bit i selects CPU i.
    smp_affinity=$(printf '%x' $((1 << i)))
    echo "echo $smp_affinity > /proc/irq/84/smp_affinity"
    echo "$smp_affinity" > /proc/irq/84/smp_affinity
    echo 'iperf3 -c primary-ws.local -t 5 -p 5000 -P 1'
    iperf3 -c primary-ws.local -t 5 -p 5000 -P 1
done
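
For reference, the mask written to smp_affinity is a hex bitmap in which bit N selects CPU N, and on machines with more than 32 CPUs it is split into comma-separated 32-bit groups. The helper below is a minimal sketch of that conversion; the name cpu_to_mask is made up for this example. Writing the plain CPU number to /proc/irq/<irq>/smp_affinity_list is an alternative that avoids computing a mask at all.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a CPU number into the hex bitmap string
# expected by /proc/irq/<irq>/smp_affinity.
cpu_to_mask() {
    local cpu=$1
    local group=$(( cpu / 32 )) bit=$(( cpu % 32 ))
    local mask i
    mask=$(printf '%x' $(( 1 << bit )))
    # Each lower 32-bit group is written out as eight zero digits.
    for ((i = 0; i < group; i++)); do
        mask+=",00000000"
    done
    echo "$mask"
}

cpu_to_mask 0
cpu_to_mask 23
cpu_to_mask 32
```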

I attach the iperf3 results for kernels 6.7.0 and 6.8.0-rc6 here.
They confirm once again that CPU0 is a bad option in both cases,
and that any other CPU, not necessarily CPU23, lets the network interface
operate at the limit of what the network cable allows.

I also attach /proc/interrupts. I hope this helps clear things up.

I don't know how else to help you. What information to provide.

About the repeatability of my "unlucky" scenario:
I have two MSI MPG B650I EDGE WIFI motherboards, and this problem
appeared on both at the same time.

It seems the problem has always been there, we just never noticed it.

--
Best Regards,
Mike Gavrilov.

Attachments:

benchmarking-6.7.0-all-cores.zip (2.10 kB)
benchmarking-6.8.0-0.rc6-all-cores.zip (2.23 kB)
proc-interrupts.zip (2.81 kB)

2024-02-27 18:05:08

by Thomas Gleixner

On Tue, Feb 27 2024 at 22:08, [email protected] wrote:
> On Mon, 2024-02-26 at 19:09 +0100, Thomas Gleixner wrote:
>> we don't have any information about the overall workload,
>
> During measurements nothing was running except iperf3

Ok.

> I don't know how else to help you. What information to provide.

If we want to understand why CPU0 is problematic, then you need to use
tracing to capture what's going on on CPU0 vs. other CPUs.

> About the repeatability of my "unlucky" scenario:
> I have two MSI MPG B650I EDGE WIFI motherboards, and this problem
> appeared on both at the same time.

Sure. The probe order and the number of interrupts are probably exactly
the same. As the spreading algorithm is very basic, it will result in
exactly the same setup for both.

> It seems the problem has always been there, we just never noticed it.

Exactly.

Thanks,

tglx

2024-02-27 18:44:36

by Mikhail Gavrilov

On Tue, 2024-02-27 at 18:23 +0100, Thomas Gleixner wrote:
> If we want to understand why CPU0 is problematic, then you need to
> use tracing to capture what's going on on CPU0 vs. other CPUs.

I don't know which profiler software you prefer.
I am familiar with sysprof, so I attach captures for both cases, CPU0 vs
CPU23. I hope this helps clear things up.

Thanks!

--
Best Regards,
Mike Gavrilov.

Attachments:

capture-CPU0.zip (3.42 MB)
capture_CPU23.zip (3.41 MB)

2024-02-29 09:42:24

by Mikhail Gavrilov

On Tue, Feb 27, 2024 at 11:03 PM <[email protected]> wrote:
>
> On Tue, 2024-02-27 at 18:23 +0100, Thomas Gleixner wrote:
> > If we want to understand why CPU0 is problematic, then you need to
> > use tracing to capture what's going on on CPU0 vs. other CPUs.
>
> I don't know which profiler software you prefer.
> I am familiar with sysprof, so I attach captures for both cases, CPU0 vs
> CPU23. I hope this helps clear things up.
>

Sorry for the noise.
I am unsure whether you received my previous message with the captures,
so I uploaded them to the Mega file-sharing service. The links are below.
capture CPU0: https://mega.nz/file/Ik5XiZAS#Hra7Xtzplp8xcHYFj4JXnpp8T-0UA0nhNSIJJLEcSBk
capture CPU23: https://mega.nz/file/swg0CQ4C#PvGv_WXmtnATD7tNun5xz-lfA5GGqA-KOv1ZbVRJ_lI

--
Best Regards,
Mike Gavrilov.

2024-03-04 14:11:20

by Thorsten Leemhuis

On 26.02.24 10:51, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 26.02.24 10:24, Mathias Nyman wrote:
>> On 26.2.2024 7.45, Linux regression tracking (Thorsten Leemhuis) wrote:
>>> On 21.02.24 14:44, Mathias Nyman wrote:
>>>> On 21.2.2024 1.43, Randy Dunlap wrote:
>>>>> On 2/20/24 15:41, Randy Dunlap wrote:
>>>>>> [+ tglx]
>>>>>> On 2/20/24 15:19, Mikhail Gavrilov wrote:
>>>>>>> On Mon, Feb 19, 2024 at 2:41 PM Mikhail Gavrilov
>>>>>>> <[email protected]> wrote:
>>>>>>> I spotted network performance regression and it turned out, this was
>>>>>>> due to the network card getting other interrupt. It is a side effect
>>>>>>> of commit 57e153dfd0e7a080373fe5853c5609443d97fa5a.
>>>>>> That's a merge commit (AFAIK, maybe not so much). The commit in
>>>>>> mainline is:
>>>>>>
>>>>>> commit f977f4c9301c
>>>>>> Author: Niklas Neronin <[email protected]>
>>>>>> Date: Fri Dec 1 17:06:40 2023 +0200
>>>>>>
>>>>>> xhci: add handler for only one interrupt line
>>>>>>
>>>>>>> Installing irqbalance daemon did not help. Maybe someone experienced
>>>>>>> such a problem?
>> This isn't really about those usb xhci patches.
>> This is about which interrupt gets assigned to which CPU.
> I know, but from my understanding of Linus expectations wrt to handling
> regressions it does not matter much if a bug existed earlier or
> somewhere else: what counts is the commit that exposed the problem.

TWIMC, I mentioned this twice in mails to Linus, he didn't get involved,
so I assume things are fine the way they are for him. And then it's of
course totally fine for me, too. :-D

Thx again for all your help and sorry for causing trouble, but in my
line of work these "might or might not be a regression from Linus'
viewpoint" cases sometimes happen.

Ciao, Thorsten

#regzbot resolve: apparently not a regression from Linus viewpoint