Date: Thu, 6 Jun 2013 18:04:46 +0200
From: Michal Hocko
To: azurIt
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, cgroups mailinglist, KAMEZAWA Hiroyuki, Johannes Weiner
Subject: Re: [PATCH for 3.2.34] memcg: do not trigger OOM if PF_NO_MEMCG_OOM is set
Message-ID: <20130606160446.GE24115@dhcp22.suse.cz>
In-Reply-To: <20130222092332.4001E4B6@pobox.sk>

Hi,
I am really sorry it took so long, but I was constantly preempted by
other stuff. I hope I have good news for you, though. Johannes has
found a nice way to overcome the deadlock issues caused by memcg OOM,
which might help you. Would you be willing to test his patch
(http://permalink.gmane.org/gmane.linux.kernel.mm/101437)? Unlike my
patch, which handles just the i_mutex case, his patch covers all
possible locks. I can backport the patch to your kernel (are you still
using a 3.2 kernel, or have you moved to a newer one?).
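If the hang reproduces during testing, the data requested in the quoted
mail below (/proc/pid/stack for each task, plus the group's tasks file)
could be gathered roughly like this. It is only a sketch and not
something from the original thread: the function names are made up, the
location of the tasks file depends on where the memory cgroup hierarchy
is mounted, and reading /proc/pid/stack normally requires root.

```python
#!/usr/bin/env python3
"""Dump /proc/<pid>/stack for every task listed in a cgroup tasks file.

Illustrative sketch only; all names here are hypothetical.
"""
import sys
from pathlib import Path


def read_task_ids(tasks_file):
    """Return the task IDs listed (one per line) in a cgroup tasks file."""
    return [int(token) for token in Path(tasks_file).read_text().split()]


def read_stack(pid, proc_root="/proc"):
    """Return the kernel stack of one task, or a note if it cannot be read."""
    try:
        return (Path(proc_root) / str(pid) / "stack").read_text()
    except OSError as err:
        # The task may have exited already, or we may lack privileges.
        return f"<unreadable: {err}>\n"


def dump_stacks(tasks_file, proc_root="/proc"):
    """Build a report: the task list first, then one stack per task."""
    pids = read_task_ids(tasks_file)
    parts = ["tasks: " + " ".join(str(pid) for pid in pids) + "\n"]
    for pid in pids:
        parts.append(f"== {pid} ==\n{read_stack(pid, proc_root)}")
    return "".join(parts)


# Only act as a command-line tool when given exactly one argument:
# the path to the group's tasks file.
if __name__ == "__main__" and len(sys.argv) == 2:
    sys.stdout.write(dump_stacks(sys.argv[1]))
```

Run as root with the path to the group's tasks file as the only
argument and redirect the output to a file. Taking a few snapshots
several seconds apart while the group is stuck shows whether the tasks
are sitting in the same place or still making progress.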
On Fri 22-02-13 09:23:32, azurIt wrote:
> >Unfortunately I am not able to reproduce this behavior even if I try
> >to hammer OOM like mad so I am afraid I cannot help you much without
> >further debugging patches.
> >I do realize that experimenting in your environment is a problem but I
> >do not have many options left. Please do not use strace and rather
> >collect /proc/pid/stack instead. It would also be helpful to get the
> >group/tasks file to have a full list of tasks in the group.
> >
>
> Hi Michal,
>
>
> Sorry that I didn't respond for a while. Today I installed a kernel with your two patches and I'm running it now. I'm still having problems with OOM, which is not able to handle low memory and is not killing processes. Here is some info:
>
> - data from cgroup 1258 while it was under OOM and no processes were killed (so the OOM didn't stop and the cgroup was frozen)
>   http://watchdog.sk/lkml/memcg-bug-6.tar.gz
>
> I noticed the problem at about 8:39 and waited until 8:57 (nothing happened). Then I killed process 19864, which seemed to help: the other processes probably ended and the cgroup started to work. But the problem occurred again about 20 seconds later, so I killed all processes at 8:58. The problem has been occurring all the time since then. All processes (in that cgroup) are always in state 'D' when it occurs.
>
> - kernel log from boot until now
>   http://watchdog.sk/lkml/kern3.gz
>
> Btw, something probably also happened at about 3:09, but I wasn't able to gather any data because my 'load check script' killed all apache processes (load was more than 100).
>
> azur

--
Michal Hocko
SUSE Labs