Hello Qian and Greg,
With the latest 5.6.x kernel I have a problem with events_power_efficient:
28488 root       28   0       0      0      0 R  95.5   0.0 101:38.19 kworker/2:1+events_power_efficient
The process starts to load the machine after 3-4 hours and the load does not stop; only a reboot removes the process.
The server runs on 2x AMD EPYC 7301 CPUs with 32 GB RAM and 2 x 10G Intel cards. When the machine carries over 1G of traffic it locks up, and only a restart fixes the problem until the next load.
After I move the traffic away and the server is no longer under load, the process is still here and still loads the server.
And after a reboot the process moves to another core.
Best Regards,
Martin
The problem is still here with kernel 5.7.7.
The last working kernel without this problem is 5.6.7.
Here is more info:
root@megacableamarilis:~# cat /proc/57259/stack
[<0>] gc_worker+0x1be/0x380 [nf_conntrack]
[<0>] process_one_work+0x1bc/0x3b0
[<0>] worker_thread+0x4d/0x460
[<0>] kthread+0x10d/0x130
[<0>] ret_from_fork+0x1f/0x30
  PID USER       PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
57259 root       28   0       0      0      0 R  69.8   0.0  82:42.14 kworker/5:2+events_power_efficient
   32 root       21   0       0      0      0 R  31.0   0.0  87:06.33 ksoftirqd/4
Please help me fix this problem.
> On 22 Apr 2020, at 15:55, Martin Zaharinov <[email protected]> wrote:
>
> Hello Qian and Greg,
> With the latest 5.6.x kernel I have a problem with events_power_efficient:
> 28488 root       28   0       0      0      0 R  95.5   0.0 101:38.19 kworker/2:1+events_power_efficient
> The process starts to load the machine after 3-4 hours and the load does not stop; only a reboot removes the process.
> The server runs on 2x AMD EPYC 7301 CPUs with 32 GB RAM and 2 x 10G Intel cards. When the machine carries over 1G of traffic it locks up, and only a restart fixes the problem until the next load.
> After I move the traffic away and the server is no longer under load, the process is still here and still loads the server.
> And after a reboot the process moves to another core.
>
> Best Regards,
> Martin
And this is the log from /sys/kernel/debug/tracing/trace:
# entries-in-buffer/entries-written: 32410/32410   #P:64
#
#                              _-----=> irqs-off
#                             / _----=> need-resched
#                            | / _---=> hardirq/softirq
#                            || / _--=> preempt-depth
#                            ||| /     delay
#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
#              | |       |   ||||       |         |
<...>-57259 [005] .... 29619.680698: workqueue_execute_start: work struct 00000000ef22e4b8: function gc_worker [nf_conntrack]
<...>-57259 [005] .... 29623.811407: workqueue_execute_end: work struct 00000000ef22e4b8: function gc_worker [nf_conntrack]
<...>-57259 [005] .... 29623.811410: workqueue_execute_start: work struct 0000000000aeec55: function fb_flashcursor
<...>-57259 [005] .... 29623.811421: workqueue_execute_end: work struct 0000000000aeec55: function fb_flashcursor
<...>-57259 [005] .... 29623.811422: workqueue_execute_start: work struct 00000000a6d382bb: function vmstat_update
<...>-57259 [005] .... 29623.811435: workqueue_execute_end: work struct 00000000a6d382bb: function vmstat_update
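For reference, the two gc_worker lines in the trace above bracket a single run of the work item: 29623.811407 - 29619.680698 is about 4.13 seconds spent in one invocation on CPU 5. Below is a minimal sketch for pulling such durations out of a saved trace. It assumes the workqueue_execute_start / workqueue_execute_end line format shown above; the function name work_durations is made up for illustration.

```shell
# work_durations: read ftrace text on stdin and print, for each
# workqueue_execute_start/workqueue_execute_end pair, the function
# name and how long the work item ran, in seconds.
# Assumption: lines look like the trace excerpt in this thread.
work_durations() {
    awk '
    {
        ts = ""; ev = ""; key = ""; fn = ""
        for (i = 1; i <= NF; i++) {
            # Timestamp field looks like "29619.680698:"; the event name follows it.
            if (ts == "" && $i ~ /^[0-9]+\.[0-9]+:$/) {
                ts = substr($i, 1, length($i) - 1)
                ev = $(i + 1)
            }
            # The work struct address identifies the work item.
            if ($i == "struct")   key = $(i + 1)
            if ($i == "function") fn  = $(i + 1)
        }
        if (ev == "workqueue_execute_start:") {
            start[key] = ts
            name[key]  = fn
        } else if (ev == "workqueue_execute_end:" && (key in start)) {
            printf "%s %.6f\n", name[key], ts - start[key]
            delete start[key]
        }
    }'
}
```

Usage would be something like `work_durations < /sys/kernel/debug/tracing/trace`. Fed the six trace lines above, it prints gc_worker 4.130709, fb_flashcursor 0.000011 and vmstat_update 0.000013, which makes the multi-second gc_worker run stand out against the other work items on the same kworker.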
> On 7 Jul 2020, at 22:44, Martin Zaharinov <[email protected]> wrote:
>
> The problem is still here with kernel 5.7.7.
>
> The last working kernel without this problem is 5.6.7.
>
> Here is more info:
>
> root@megacableamarilis:~# cat /proc/57259/stack
> [<0>] gc_worker+0x1be/0x380 [nf_conntrack]
> [<0>] process_one_work+0x1bc/0x3b0
> [<0>] worker_thread+0x4d/0x460
> [<0>] kthread+0x10d/0x130
> [<0>] ret_from_fork+0x1f/0x30
>
>   PID USER       PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
> 57259 root       28   0       0      0      0 R  69.8   0.0  82:42.14 kworker/5:2+events_power_efficient
>    32 root       21   0       0      0      0 R  31.0   0.0  87:06.33 ksoftirqd/4
>
> Please help me fix this problem.
>
Adding Greg, Florian, and Eric to this bug.
> On 7 Jul 2020, at 22:54, Martin Zaharinov <[email protected]> wrote:
>
> And this is the log from /sys/kernel/debug/tracing/trace:
>
> # entries-in-buffer/entries-written: 32410/32410   #P:64
> #
> #                              _-----=> irqs-off
> #                             / _----=> need-resched
> #                            | / _---=> hardirq/softirq
> #                            || / _--=> preempt-depth
> #                            ||| /     delay
> #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
> #              | |       |   ||||       |         |
> <...>-57259 [005] .... 29619.680698: workqueue_execute_start: work struct 00000000ef22e4b8: function gc_worker [nf_conntrack]
> <...>-57259 [005] .... 29623.811407: workqueue_execute_end: work struct 00000000ef22e4b8: function gc_worker [nf_conntrack]
> <...>-57259 [005] .... 29623.811410: workqueue_execute_start: work struct 0000000000aeec55: function fb_flashcursor
> <...>-57259 [005] .... 29623.811421: workqueue_execute_end: work struct 0000000000aeec55: function fb_flashcursor
> <...>-57259 [005] .... 29623.811422: workqueue_execute_start: work struct 00000000a6d382bb: function vmstat_update
> <...>-57259 [005] .... 29623.811435: workqueue_execute_end: work struct 00000000a6d382bb: function vmstat_update
On Wed, Jul 08, 2020 at 09:50:49AM +0300, Martin Zaharinov wrote:
> Adding Greg, Florian, and Eric to this bug.
Have you used 'git bisect' to try to find the offending commit?
Without that, it's going to be hard to help you out here.
thanks,
greg k-h
Yes, I searched but did not find any information.
I am writing here in case anyone has the same problem, or any of you
has an idea where this problem comes from.
If more debugging is needed, just tell me how to get more information.
Martin
On Wed, 8 Jul 2020 at 10:09, Greg KH <[email protected]> wrote:
> Have you used 'git bisect' to try to find the offending commit?
>
> Without that, it's going to be hard to help you out here.
>
> thanks,
>
> greg k-h
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?
A: No.
Q: Should I include quotations after my reply?
http://daringfireball.net/2007/07/on_top
On Wed, Jul 08, 2020 at 11:34:52AM +0300, Martin Zaharinov wrote:
> Yes, I searched but did not find any information.
Please do the testing yourself, using 'git bisect' to find the offending
commit.
thanks,
greg k-h
Hi,
OK, I found that the problem comes from nf_conntrack_core.
I isolated the problem to this part:
queue_delayed_work(system_power_efficient_wq, &conntrack_gc_work.dwork, HZ);
When this work is queued as delayed work, at some point, if the connection tracking table is too big, the delayed work gets stuck and starts causing high CPU load.
This needs to be checked and a solution found.
For now I have removed the queue_delayed_work call. I will wait, check the machine, and report the status.
Martin
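One way to check the "connection tracking table is too big" theory is to record how full the table is when the kworker starts spinning. A minimal sketch, assuming the usual nf_conntrack_count and nf_conntrack_max sysctl files are present (they exist only while nf_conntrack is loaded); the function name and the optional directory argument are made up for illustration.

```shell
# conntrack_usage [DIR]
# Print the current number of tracked connections, the configured
# maximum, and the fill level in percent.
# DIR defaults to the standard sysctl location; it is a parameter only
# so the function can be pointed at a copy of the files.
conntrack_usage() {
    dir=${1:-/proc/sys/net/netfilter}
    count=$(cat "$dir/nf_conntrack_count") || return 1
    max=$(cat "$dir/nf_conntrack_max") || return 1
    if [ "$max" -gt 0 ]; then
        echo "$count / $max entries ($((count * 100 / max))%)"
    else
        echo "$count / $max entries"
    fi
}
```

Logging this once a minute next to the top output would show whether the high CPU load in gc_worker lines up with a large entry count.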
> On 8 Jul 2020, at 12:28, Greg KH <[email protected]> wrote:
>
>
> A: Because it messes up the order in which people normally read text.
> Q: Why is top-posting such a bad thing?
> A: Top-posting.
> Q: What is the most annoying thing in e-mail?
> A: No.
> Q: Should I include quotations after my reply?
>
> http://daringfireball.net/2007/07/on_top
>
> On Wed, Jul 08, 2020 at 11:34:52AM +0300, Martin Zaharinov wrote:
>> Yes, I searched but did not find any information.
>
> Please do the testing yourself, using 'git bisect' to find the offending
> commit.
>
> thanks,
>
> greg k-h