In-Reply-To: <20100827095639.6e7297de.nishimura@mxp.nes.nec.co.jp>
References: <20100827095639.6e7297de.nishimura@mxp.nes.nec.co.jp>
Date: Fri, 27 Aug 2010 06:50:44 +0530
Subject: Re: cgroup: rmdir() does not complete
From: Balbir Singh
To: Daisuke Nishimura
Cc: Mark Hills, KAMEZAWA Hiroyuki, linux-kernel@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Aug 27, 2010 at 6:26 AM, Daisuke Nishimura wrote:
> Hi.
>
> On Thu, 26 Aug 2010 16:51:55 +0100 (BST)
> Mark Hills wrote:
>
>> I am experiencing hung tasks when trying to rmdir() on a cgroup. One task
>> spins, others queue up behind it with the following:
>>
>>   INFO: task soaked-cgroup:27257 blocked for more than 120 seconds.
>>   "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>   soaked-cgrou D ffff8800058157c0     0 27257  29411 0x00000000
>>   ffff88004ffffdd8 0000000000000086 ffff88004ffffda8 ffff88004ffffeb8
>>   0000000000000010 ffff880119813780 ffff88004ffffd48 ffff88004fffffd8
>>   ffff88004fffffd8 000000000000f9b0 00000000000157c0 ffff880137693268
>>   Call Trace:
>>   [] ? mntput_no_expire+0x24/0xe7
>>   [] __mutex_lock_common+0x14d/0x1b4
>>   [] ? path_put+0x1d/0x22
>>   [] __mutex_lock_slowpath+0x14/0x16
>>   [] mutex_lock+0x31/0x4b
>>   [] do_rmdir+0x74/0x102
>>   [] sys_rmdir+0x11/0x13
>>   [] system_call_fastpath+0x16/0x1b
>>
>> Kernel is from Fedora, 2.6.33.6. In all cases the cgroup contains no
>> tasks.
>>
>> Commit ec64f5 ("fix frequent -EBUSY at rmdir") adds a busy wait loop to
>> the rmdir. It looks like what I am seeing here and indicates that some
>> cgroup subsystem is busy, indefinitely.
>>
> The commit had caused a bug about rmdir, but it was fixed by the commit 88703267.
> The fix was merged in 2.6.31, so it seems that you hit a new one...
>
>> I have not worked out how to reproduce it quickly. My only way is to
>> complete a 'dd' command in the cgroup, but then the problem is so rare it
>> is slow progress.
>>
>> Documentation/cgroup.memory.txt describes how force_empty can be required
>> in some cases. Does this mean that with the patch above, these cases will
>> now spin on rmdir(), instead of returning -EBUSY? How can produce a
>> reliable test case requiring memory.force_empty to be used, to test this?
>>
> You don't need to touch "force_empty". rmdir() does what "force_empty" does.
>
>> Or is it likely to be some other cause, and how best to find it?
>>
> What cgroup subsystem did you mount where the directory existed you tried
> to rmdir() first ?
> If you mounted several subsystems on the same hierarchy, can you mount them
> separately to narrow down the cause ?
>

It would also be nice to see what your mounted cgroup (filesystem
perspective) looks like and what /proc/cgroups looks like when the
problem occurs.

Balbir
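
For anyone gathering the information requested above (the cgroup mounts as the filesystem sees them, plus the contents of /proc/cgroups), here is a rough Python sketch. It is not from the thread: the function names are hypothetical, and it assumes the usual six-field /proc/mounts layout and a four-column /proc/cgroups (subsys_name, hierarchy, num_cgroups, enabled). Treat it as a starting point and check the output against the raw files.

```python
from pathlib import Path


def parse_cgroup_mounts(mounts_text):
    """Return (mount_point, options) for each cgroup entry in /proc/mounts text."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Assumed field order: device, mount point, fs type, options, dump, pass.
        if len(fields) >= 4 and fields[2] == "cgroup":
            result.append((fields[1], fields[3].split(",")))
    return result


def parse_proc_cgroups(cgroups_text):
    """Return one dict per subsystem row in /proc/cgroups text."""
    rows = []
    for line in cgroups_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the "#subsys_name ..." header
        fields = line.split()
        if len(fields) < 4:
            continue  # assumption: four columns; ignore anything shorter
        name, hierarchy, num_cgroups, enabled = fields[:4]
        rows.append({
            "name": name,
            "hierarchy": int(hierarchy),
            "num_cgroups": int(num_cgroups),
            "enabled": enabled == "1",
        })
    return rows


def read_or_empty(path):
    """Read a file, returning an empty string if it cannot be read."""
    try:
        return Path(path).read_text()
    except OSError:
        return ""


def report(mounts_path="/proc/mounts", cgroups_path="/proc/cgroups"):
    """Build a plain-text summary suitable for pasting into a reply."""
    lines = ["cgroup mounts:"]
    for mount_point, options in parse_cgroup_mounts(read_or_empty(mounts_path)):
        lines.append(f"  {mount_point}: {','.join(options)}")
    lines.append("subsystems:")
    for row in parse_proc_cgroups(read_or_empty(cgroups_path)):
        lines.append(
            f"  {row['name']}: hierarchy={row['hierarchy']} "
            f"num_cgroups={row['num_cgroups']} enabled={row['enabled']}"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    print(report())
```

Running it while the rmdir() is stuck and pasting the output into a reply would show which subsystems share a hierarchy, which is the first thing to narrow down.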