Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753599AbbHaPsA (ORCPT ); Mon, 31 Aug 2015 11:48:00 -0400
Received: from mail-qk0-f170.google.com ([209.85.220.170]:35731 "EHLO
	mail-qk0-f170.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753104AbbHaPr7 (ORCPT ); Mon, 31 Aug 2015 11:47:59 -0400
Date: Mon, 31 Aug 2015 11:47:56 -0400
From: Tejun Heo
To: Vladimir Davydov
Cc: Michal Hocko, Andrew Morton, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/2] Fix memcg/memory.high in case kmem accounting is enabled
Message-ID: <20150831154756.GE2271@mtj.duckdns.org>
References: <20150831132414.GG29723@dhcp22.suse.cz>
	<20150831134335.GB2271@mtj.duckdns.org>
	<20150831143007.GA13814@esperanza>
	<20150831143939.GC2271@mtj.duckdns.org>
	<20150831151814.GC13814@esperanza>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150831151814.GC13814@esperanza>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1923
Lines: 39

Hello,

On Mon, Aug 31, 2015 at 06:18:14PM +0300, Vladimir Davydov wrote:
> We have to be cautious about placing memcg_charge in slab/slub. To
> understand why, consider SLAB case, which first tries to allocate from
> all nodes in the order of preference w/o __GFP_WAIT and only if it fails
> falls back on an allocation from any node w/ __GFP_WAIT. This is its
> internal algorithm. If we blindly put memcg_charge to alloc_slab method,
> then, when we are near the memcg limit, we will go over all NUMA nodes
> in vain, then finally fall back to __GFP_WAIT allocation, which will get
> a slab from a random node. Not only we do more work than necessary due
> to walking over all NUMA nodes for nothing, but we also break SLAB
> internal logic!
> And you just can't fix it in memcg, because memcg knows
> nothing about the internal logic of SLAB, how it handles NUMA nodes.
>
> SLUB has a different problem. It tries to avoid high-order allocations
> if there is a risk of invoking costly memory compactor. It has nothing
> to do with memcg, because memcg does not care if the charge is for a
> high order page or not.

Maybe I'm missing something but aren't both issues caused by memcg
failing to provide headroom for NOWAIT allocations when the
consumption gets close to the max limit?  Regardless of the specific
usage, !__GFP_WAIT means "give me memory if it can be spared w/o
inducing direct time-consuming maintenance work" and the contract
around it is that such requests will mostly succeed under nominal
conditions.  Also, slab/slub might not stay as the only user of
try_charge().  I still think solving this from memcg side is the right
direction.

Thanks.

--
tejun
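
The headroom idea raised above can be illustrated with a toy model. This
is only a sketch in Python, not kernel code: the class ToyMemcg, its
fields (limit, headroom, usage, reclaimable) and the can_wait flag are
all hypothetical names made up for the illustration, and the method
named try_charge() here only borrows the name of the kernel function; it
does not reproduce its logic. The assumption is that callers that may
wait are held to a ceiling slightly below the max limit, so callers that
may not wait still find some room left.

```python
from dataclasses import dataclass


@dataclass
class ToyMemcg:
    """Toy page counter. NOT the kernel's try_charge() logic."""
    limit: int            # hard limit in pages (stands in for the max limit)
    headroom: int         # pages held back for no-wait charges (assumed knob)
    usage: int = 0        # pages currently charged
    reclaimable: int = 0  # charged pages a blocking reclaim pass could free

    def _reclaim(self, want: int) -> int:
        """Stand-in for the time-consuming work; returns pages freed."""
        freed = min(want, self.reclaimable)
        self.reclaimable -= freed
        self.usage -= freed
        return freed

    def try_charge(self, pages: int, can_wait: bool) -> bool:
        # Callers that may wait are held to a lower ceiling, so the last
        # `headroom` pages stay available to callers that may not wait.
        ceiling = self.limit - self.headroom if can_wait else self.limit
        excess = self.usage + pages - ceiling
        if excess > 0 and can_wait:
            self._reclaim(excess)
        if self.usage + pages > ceiling:
            return False
        self.usage += pages
        return True


# With headroom, a no-wait charge near the limit succeeds without reclaim.
with_room = ToyMemcg(limit=100, headroom=10, usage=90, reclaimable=20)
nowait_ok = with_room.try_charge(4, can_wait=False)

# Without headroom, the same kind of request fails once usage hits the limit.
no_room = ToyMemcg(limit=100, headroom=0, usage=100, reclaimable=20)
nowait_fails = no_room.try_charge(1, can_wait=False)
```

In this model the no-wait request never triggers reclaim; it succeeds or
fails purely on whether spare pages exist, which is the "give me memory
if it can be spared" contract described above. How much headroom would
be appropriate, and who would replenish it, is left open here.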