Hi, greg k-h
During our testing with 4.4.73 we got soft lockup like:
NMI watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [systemd-udevd:856]
...
Call Trace:
[<ffffffff8109d139>] vprintk_emit+0x319/0x4a0
[<ffffffff8112182d>] printk_emit+0x33/0x3b
[<ffffffff812f9e9c>] ? simple_strtoull+0x2c/0x50
[<ffffffff8109d39a>] devkmsg_write+0xaa/0x100
[<ffffffff8109d2f0>] ? vprintk+0x30/0x30
[<ffffffff811915f2>] do_readv_writev+0x1c2/0x270
[<ffffffff8117899d>] ? kmem_cache_free+0x7d/0x1a0
[<ffffffff81191729>] vfs_writev+0x39/0x50
[<ffffffff8119240a>] SyS_writev+0x4a/0xd0
[<ffffffff8158bc97>] entry_SYSCALL_64_fastpath+0x12/0x6a
Currently in 4.4 the console_unlock() called by vprintk_emit() is with
preemption disabled, so the cond_resched is not working, and soft lockup
appear if it take too much time on writing data into every console.
We found the upstream patch:
commit 6b97a20d3a79 printk: set may_schedule for some of console_trylock() callers
which should have addressed this issue, but not included in the latest 4.4.78 stable
yet, is there any plan on backport it in future?
Regards,
Michael Wang
On Tue, Jul 25, 2017 at 10:28:16AM +0200, Michael Wang wrote:
> Hi, greg k-h
>
> During our testing with 4.4.73 we got soft lockup like:
>
> NMI watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [systemd-udevd:856]
> ...
> Call Trace:
> [<ffffffff8109d139>] vprintk_emit+0x319/0x4a0
> [<ffffffff8112182d>] printk_emit+0x33/0x3b
> [<ffffffff812f9e9c>] ? simple_strtoull+0x2c/0x50
> [<ffffffff8109d39a>] devkmsg_write+0xaa/0x100
> [<ffffffff8109d2f0>] ? vprintk+0x30/0x30
> [<ffffffff811915f2>] do_readv_writev+0x1c2/0x270
> [<ffffffff8117899d>] ? kmem_cache_free+0x7d/0x1a0
> [<ffffffff81191729>] vfs_writev+0x39/0x50
> [<ffffffff8119240a>] SyS_writev+0x4a/0xd0
> [<ffffffff8158bc97>] entry_SYSCALL_64_fastpath+0x12/0x6a
>
> Currently in 4.4 the console_unlock() called by vprintk_emit() is with
> preemption disabled, so the cond_resched is not working, and soft lockup
> appear if it take too much time on writing data into every console.
>
> We found the upstream patch:
> commit 6b97a20d3a79 printk: set may_schedule for some of console_trylock() callers
>
> which should have addressed this issue, but not included in the latest 4.4.78 stable
> yet, is there any plan on backport it in future?
That commit was in the 4.6 kernel release, so odds are, it's not on a
list of patches to backport any time soon :)
But, you can ask for it to be included, if you cc: the stable@vger list,
like I did here, and add the authors of the patch to the cc: asking if
they have any objection for it being backported (also added here...)
Sergey, any objection for me backporting this?
thanks,
greg k-h
Hello,
On (07/25/17 06:57), Greg KH wrote:
> On Tue, Jul 25, 2017 at 10:28:16AM +0200, Michael Wang wrote:
> > Hi, greg k-h
> >
> > During our testing with 4.4.73 we got soft lockup like:
> >
> > NMI watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [systemd-udevd:856]
> > ...
> > Call Trace:
> > [<ffffffff8109d139>] vprintk_emit+0x319/0x4a0
> > [<ffffffff8112182d>] printk_emit+0x33/0x3b
> > [<ffffffff812f9e9c>] ? simple_strtoull+0x2c/0x50
> > [<ffffffff8109d39a>] devkmsg_write+0xaa/0x100
> > [<ffffffff8109d2f0>] ? vprintk+0x30/0x30
> > [<ffffffff811915f2>] do_readv_writev+0x1c2/0x270
> > [<ffffffff8117899d>] ? kmem_cache_free+0x7d/0x1a0
> > [<ffffffff81191729>] vfs_writev+0x39/0x50
> > [<ffffffff8119240a>] SyS_writev+0x4a/0xd0
> > [<ffffffff8158bc97>] entry_SYSCALL_64_fastpath+0x12/0x6a
> >
> > Currently in 4.4 the console_unlock() called by vprintk_emit() is with
> > preemption disabled, so the cond_resched is not working, and soft lockup
> > appear if it take too much time on writing data into every console.
> >
> > We found the upstream patch:
> > commit 6b97a20d3a79 printk: set may_schedule for some of console_trylock() callers
> >
> > which should have addressed this issue, but not included in the latest 4.4.78 stable
> > yet, is there any plan on backport it in future?
>
> That commit was in the 4.6 kernel release, so odds are, it's not on a
> list of patches to backport any time soon :)
>
> But, you can ask for it to be included, if you cc: the stable@vger list,
> like I did here, and add the authors of the patch to the cc: asking if
> they have any objection for it being backported (also added here...)
>
> Sergey, any objection for me backporting this?
the uncomfortable truth is that the patch will not address all of the
scenarios that can lead to lockups. printk() and console_unlock() are
ugly beasts and we are still working on a proper solution; it's a bit
harder that I though, but we are close.
(for more details https://lwn.net/Articles/705938/)
there are still tons of context and cases when printk() _can_ end up
in console_unlock() and spend there uncontrollable amount of time
(as long as other CPUs will simultaneously append new messages to the
logbuf):
spin_lock()
printk() -> console_unlock()
spin_unlock()
spin_lock_irqsave()
printk() -> console_unlock()
spin_unlock_irqrestore()
preempt_disable()
printk() -> console_unlock()
preempt_enable()
local_irq_disable()
printk() -> console_unlock()
local_irq_restore()
rcu_read_lock()
printk() -> console_unlock()
rcu_read_unlock()
IRQ->
printk() -> console_unlock()
printk_deferred()
IRQ->
printk() -> console_unlock()
and so on. in all those cases, printing CPU does not return from
console_unlock() as long as there are pending messages left; if other
CPUs printk() a lot, then printing CPU loops in console_unlock(),
regardless the context and circumstances. which is often times
result in lockups, panics, etc.
returning back to the patch,
there are two things related to it I can quickly think of:
1) the patch needs a fix-up commit 257ab443118bffc ("printk: Correctly
handle preemption in console_unlock()")
2) it may affect users
this is a bit weird... but, in fact, console_unlock() must always
run with disabled preemption. otherwise, we schedule with the
console_sem locked, which is risky and pointless. executing
console_unlock() with preemption disabled is, obviously, dangerous,
that's why consle_unlock() should stop doing the endless loop and
instead must offload (after some time) the printing duty to another
kthread/CPU.
more closer to the point,
we've got reports that in some cases due to additional points of
scheduling in console_unlock() doing massive print outs (like showing
backtraces of all tasks, or OOM dump) now takes more time (it really
depends on the conditions).
so the patch addresses some of trivial lockup cases and in general not
that bad, but at the same time it's not a complete solution and it has
some side effects.
-ss
On Wed 2017-07-26 01:33:25, Sergey Senozhatsky wrote:
> Hello,
>
> On (07/25/17 06:57), Greg KH wrote:
> > On Tue, Jul 25, 2017 at 10:28:16AM +0200, Michael Wang wrote:
> > > Hi, greg k-h
> > >
> > > During our testing with 4.4.73 we got soft lockup like:
> > >
> > > NMI watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [systemd-udevd:856]
> > > ...
> > > Call Trace:
> > > [<ffffffff8109d139>] vprintk_emit+0x319/0x4a0
> > > [<ffffffff8112182d>] printk_emit+0x33/0x3b
> > > [<ffffffff812f9e9c>] ? simple_strtoull+0x2c/0x50
> > > [<ffffffff8109d39a>] devkmsg_write+0xaa/0x100
> > > [<ffffffff8109d2f0>] ? vprintk+0x30/0x30
> > > [<ffffffff811915f2>] do_readv_writev+0x1c2/0x270
> > > [<ffffffff8117899d>] ? kmem_cache_free+0x7d/0x1a0
> > > [<ffffffff81191729>] vfs_writev+0x39/0x50
> > > [<ffffffff8119240a>] SyS_writev+0x4a/0xd0
> > > [<ffffffff8158bc97>] entry_SYSCALL_64_fastpath+0x12/0x6a
> > >
> > > Currently in 4.4 the console_unlock() called by vprintk_emit() is with
> > > preemption disabled, so the cond_resched is not working, and soft lockup
> > > appear if it take too much time on writing data into every console.
> > >
> > > We found the upstream patch:
> > > commit 6b97a20d3a79 printk: set may_schedule for some of console_trylock() callers
> > >
> > > which should have addressed this issue, but not included in the latest 4.4.78 stable
> > > yet, is there any plan on backport it in future?
>
> returning back to the patch,
>
> there are two things related to it I can quickly think of:
>
> 1) the patch needs a fix-up commit 257ab443118bffc ("printk: Correctly
> handle preemption in console_unlock()")
>
> 2) it may affect users
> this is a bit weird... but, in fact, console_unlock() must always
> run with disabled preemption. otherwise, we schedule with the
> console_sem locked, which is risky and pointless. executing
> console_unlock() with preemption disabled is, obviously, dangerous,
> that's why consle_unlock() should stop doing the endless loop and
> instead must offload (after some time) the printing duty to another
> kthread/CPU.
>
> more closer to the point,
> we've got reports that in some cases due to additional points of
> scheduling in console_unlock() doing massive print outs (like showing
> backtraces of all tasks, or OOM dump) now takes more time (it really
> depends on the conditions).
>
>
> so the patch addresses some of trivial lockup cases and in general not
> that bad, but at the same time it's not a complete solution and it has
> some side effects.
Also note that the patch has an effect only when the kernel is built
with PREEMPT_COUNT enabled. Otherwise, printk() is not able to detect
preemtible context and skips the extra cond_resched().
The rest was explained by Sergey. In short, the proper solution is to
allow offloading from console_unlock(). Unfortunately, it is not easy
to do it a way that would be acceptable in all situations. But I hope
that we are getting close.
Best Regards,
Petr