Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1757556Ab3GETTF (ORCPT ); Fri, 5 Jul 2013 15:19:05 -0400
Received: from zene.cmpxchg.org ([85.214.230.12]:48627 "EHLO zene.cmpxchg.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751934Ab3GETTD (ORCPT ); Fri, 5 Jul 2013 15:19:03 -0400
Date: Fri, 5 Jul 2013 15:18:54 -0400
From: Johannes Weiner
To: azurIt
Cc: Michal Hocko , linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	cgroups mailinglist , KAMEZAWA Hiroyuki
Subject: Re: [PATCH for 3.2] memcg: do not trap chargers with full callstack on OOM
Message-ID: <20130705191854.GR17812@cmpxchg.org>
References: <20130606160446.GE24115@dhcp22.suse.cz>
	<20130606181633.BCC3E02E@pobox.sk>
	<20130607131157.GF8117@dhcp22.suse.cz>
	<20130617122134.2E072BA8@pobox.sk>
	<20130619132614.GC16457@dhcp22.suse.cz>
	<20130622220958.D10567A4@pobox.sk>
	<20130624201345.GA21822@cmpxchg.org>
	<20130628120613.6D6CAD21@pobox.sk>
	<20130705181728.GQ17812@cmpxchg.org>
	<20130705210246.11D2135A@pobox.sk>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20130705210246.11D2135A@pobox.sk>
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2238
Lines: 46

On Fri, Jul 05, 2013 at 09:02:46PM +0200, azurIt wrote:
> >I looked at your debug messages but could not find anything that would
> >hint at a deadlock. All tasks are stuck in the refrigerator, so I
> >assume you use the freezer cgroup and enabled it somehow?
>
>
> Yes, i'm really using freezer cgroup BUT i was checking if it's not
> doing problems - unfortunately, several days passed from that day
> and now i don't fully remember if i was checking it for both cases
> (unremoveabled cgroups and these freezed processes holding web
> server port). I'm 100% sure i was checking it for unremoveable
> cgroups but not so sure for the other problem (i had to act quickly
> in that case).
> Are you sure (from stacks) that freezer cgroup was
> enabled there?

Yeah, all the traces without exception look like this:

1372089762/23433/stack:[] refrigerator+0x95/0x160
1372089762/23433/stack:[] get_signal_to_deliver+0x1cb/0x540
1372089762/23433/stack:[] do_signal+0x6b/0x750
1372089762/23433/stack:[] do_notify_resume+0x55/0x80
1372089762/23433/stack:[] int_signal+0x12/0x17
1372089762/23433/stack:[] 0xffffffffffffffff

so the freezer was already enabled when you took the backtraces.

> Btw, what about that other stacks? I mean this file:
> http://watchdog.sk/lkml/memcg-bug-7.tar.gz
>
> It was taken while running the kernel with your patch and from
> cgroup which was under unresolveable OOM (just like my very original
> problem).

I looked at these traces too, but none of the tasks are stuck in rmdir
or the OOM path. Some /are/ in the page fault path, but they are
happily doing reclaim and don't appear to be stuck. So I'm having a
hard time matching this data to what you otherwise observed.

However, based on what you reported the most likely explanation for
the continued hangs is the unfinished OOM handling for which I sent
the followup patch for arch/x86/mm/fault.c.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/