Date: Thu, 27 Oct 2016 08:52:32 -0400 (EDT)
From: CAI Qian
To: tj
Cc: cgroups@vger.kernel.org, Johannes Weiner, linux-kernel
Subject: local DoS - systemd hang or timeout with cgroup traces
Message-ID: <1565579766.1600243.1477572752328.JavaMail.zimbra@redhat.com>
In-Reply-To: <20161004214219.GN4205@htj.duckdns.org>
References: <2131586457.763354.1475242373422.JavaMail.zimbra@redhat.com>
            <1415238593.811146.1475257337058.JavaMail.zimbra@redhat.com>
            <774397084.821469.1475260403929.JavaMail.zimbra@redhat.com>
            <20161003013737.GR19539@ZenIV.linux.org.uk>
            <1937480340.100083.1475516965286.JavaMail.zimbra@redhat.com>
            <1812816839.401734.1475602751170.JavaMail.zimbra@redhat.com>
            <20161004214219.GN4205@htj.duckdns.org>

So this can still be reproduced in 4.9-rc2 by running trinity as a
non-root user for less than 30 minutes on this machine, on either ext4
or xfs. Below is the trace on ext4; the sysrq-w report is here:

http://people.redhat.com/qcai/tmp/dmesg-ext4-cgroup-hang

CAI Qian

----- Original Message -----
> From: "tj"
> Sent: Tuesday, October 4, 2016 5:42:19 PM
> Subject: Re: local DoS - systemd hang or timeout (WAS: Re: [RFC][CFT] splice_read reworked)
>
> ...
> > Not sure if related, but right after this lockdep warning fired and
> > trinity, run by a non-privileged user, finished inside the container,
> > the host's systemctl commands just hang or time out, which renders
> > the whole system unusable.
> >
> > # systemctl status docker
> > Failed to get properties: Connection timed out
> >
> > # systemctl reboot (hang)
> >
> ...
> > [ 5535.893675] INFO: lockdep is turned off.
> > [ 5535.898085] INFO: task kworker/45:4:146035 blocked for more than 120 seconds.
> > [ 5535.906059] Tainted: G        W  4.8.0-rc8-fornext+ #1
> > [ 5535.912865] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [ 5535.921613] kworker/45:4    D ffff880853e9b950 14048 146035      2 0x00000080
> > [ 5535.929630] Workqueue: cgroup_destroy css_killed_work_fn
> > [ 5535.935582]  ffff880853e9b950 0000000000000000 0000000000000000 ffff88086c6da000
> > [ 5535.943882]  ffff88086c9e2000 ffff880853e9c000 ffff880853e9baa0 ffff88086c9e2000
> > [ 5535.952205]  ffff880853e9ba98 0000000000000001 ffff880853e9b968 ffffffff817cdaaf
> > [ 5535.960522] Call Trace:
> > [ 5535.963265]  [] schedule+0x3f/0xa0
> > [ 5535.968817]  [] schedule_timeout+0x3db/0x6f0
> > [ 5535.975346]  [] ? wait_for_completion+0x45/0x130
> > [ 5535.982256]  [] wait_for_completion+0xc3/0x130
> > [ 5535.988972]  [] ? wake_up_q+0x80/0x80
> > [ 5535.994804]  [] drop_sysctl_table+0xc4/0xe0
> > [ 5536.001227]  [] drop_sysctl_table+0x77/0xe0
> > [ 5536.007648]  [] unregister_sysctl_table+0x4d/0xa0
> > [ 5536.014654]  [] unregister_sysctl_table+0x7f/0xa0
> > [ 5536.021657]  [] unregister_sched_domain_sysctl+0x15/0x40
> > [ 5536.029344]  [] partition_sched_domains+0x44/0x450
> > [ 5536.036447]  [] ? __mutex_unlock_slowpath+0x111/0x1f0
> > [ 5536.043844]  [] rebuild_sched_domains_locked+0x64/0xb0
> > [ 5536.051336]  [] update_flag+0x11d/0x210
> > [ 5536.057373]  [] ? mutex_lock_nested+0x2df/0x450
> > [ 5536.064186]  [] ? cpuset_css_offline+0x1b/0x60
> > [ 5536.070899]  [] ? trace_hardirqs_on+0xd/0x10
> > [ 5536.077420]  [] ? mutex_lock_nested+0x2df/0x450
> > [ 5536.084234]  [] ? css_killed_work_fn+0x25/0x220
> > [ 5536.091049]  [] cpuset_css_offline+0x35/0x60
> > [ 5536.097571]  [] css_killed_work_fn+0x5c/0x220
> > [ 5536.104207]  [] process_one_work+0x1df/0x710
> > [ 5536.110736]  [] ? process_one_work+0x160/0x710
> > [ 5536.117461]  [] worker_thread+0x12b/0x4a0
> > [ 5536.123697]  [] ? process_one_work+0x710/0x710
> > [ 5536.130426]  [] kthread+0xfe/0x120
> > [ 5536.135991]  [] ret_from_fork+0x1f/0x40
> > [ 5536.142041]  [] ? kthread_create_on_node+0x230/0x230
>
> This one seems to be the offender: cgroup is trying to offline a
> cpuset css, which takes place under cgroup_mutex. The offlining ends
> up trying to drain active usages of a sysctl table, which apparently
> is not happening. Did something hang or crash while trying to
> generate sysctl content?
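
For reference, the "takes place under cgroup_mutex" part is visible in
the cgroup_destroy workqueue item named in the trace. A rough sketch of
css_killed_work_fn() as it looked around v4.8 in kernel/cgroup.c
(paraphrased for illustration, not the verbatim upstream source):

    static void css_killed_work_fn(struct work_struct *work)
    {
            struct cgroup_subsys_state *css =
                    container_of(work, struct cgroup_subsys_state, destroy_work);

            /* all css offlining serializes on cgroup_mutex */
            mutex_lock(&cgroup_mutex);

            do {
                    offline_css(css);   /* -> cpuset_css_offline() in the trace */
                    css_put(css);
                    /* @css can't go away while we're holding cgroup_mutex */
                    css = css->parent;
            } while (css && atomic_dec_and_test(&css->online_cnt));

            mutex_unlock(&cgroup_mutex);
    }

Because cgroup_mutex is held across the offline callback, a worker that
gets stuck inside cpuset_css_offline() blocks every other cgroup
operation behind it, including the ones systemd issues, which is
consistent with systemctl hanging or timing out system-wide.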
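
The "drain active usages of a sysctl table" step is the
wait_for_completion() at the top of the backtrace. A minimal sketch of
that logic, paraphrased from fs/proc/proc_sysctl.c of the same era
(simplified; the surrounding sysctl_lock handling and list bookkeeping
are abbreviated):

    /*
     * Called with sysctl_lock held.  A table header cannot be torn
     * down while readers are active; p->used counts active users.
     */
    static void start_unregistering(struct ctl_table_header *p)
    {
            if (unlikely(p->used)) {
                    struct completion wait;

                    init_completion(&wait);
                    p->unregistering = &wait;
                    spin_unlock(&sysctl_lock);
                    /*
                     * This is the wait_for_completion() in the backtrace.
                     * The last active user is expected to complete() it;
                     * if that user never finishes, we block here forever.
                     */
                    wait_for_completion(&wait);
                    spin_lock(&sysctl_lock);
            } else {
                    /* anything non-NULL; it is never dereferenced */
                    p->unregistering = ERR_PTR(-EINVAL);
            }
            /* only now is the header unlinked from its directory */
            erase_header(p);
    }

On the other side, every /proc/sys access increments p->used on the way
in and decrements it on the way out, and the final decrement is supposed
to complete() the waiter. So the backtrace suggests some reader of a
sched-domain sysctl entry never returned, which matches the closing
question about something hanging while generating sysctl content.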