Received: by 10.192.165.156 with SMTP id m28csp652742imm; Fri, 13 Apr 2018 05:43:31 -0700 (PDT) X-Google-Smtp-Source: AIpwx4+J9g6md/zsVKVBlHEItoWKyfnJBC1rardO0kserX7fKggyKW7X+g3g9K7z/LIO6kBK4yUq X-Received: by 10.98.197.196 with SMTP id j187mr3567887pfg.244.1523623411105; Fri, 13 Apr 2018 05:43:31 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1523623411; cv=none; d=google.com; s=arc-20160816; b=dh7DCQtYhYHwNmIKZHXS8NdI8WY5aTHyhi++P8jzz7E9pY0mkgi3cgNMPCvV+PiQo2 I1/YFa3gxslfSxVKnAqeozVLvl5W8uHc2+R0wLHyFePdlQJoBfFw5xTtAQZ3Ax66W5LZ WLaeDg3J3ZtAeqIptlziGmtjQIhNEXp971bAgTVgJHBRiv0Zq6RUhYZEBLF4zEYO8tbt oSvgSHa9RuQ/l6K34csP4jLpS5iDk8JIMDlr0yz9s8Vz1Cc2PjYMecvxwMfn0t3nabpg twOu5ejVQTp3RQ+xh5VjJhz4vGw/ibWrhslnZOCm+9FodxDb6bAilgBoIC8RUd7KtSQ8 d7Ig== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:arc-authentication-results; bh=N3Sa862mNAxMMnZ49wl57+75reBoRFNPkp5GrXYdK08=; b=Uu3P4M0tPJKLpXjvWVJTdFMUcdiH7FsfVH/7SfeuKsDpowHz6+OKBwki43zhXDhnao OGdxFtyygM3Ptcs+7H9fnngGpTKC+6euoNpJKK/srtzDDFNDaWUy93bDTSl8bjZsNfiz xq1PXWoMop5Qy4sp1E5uJVqVTbZW9FUtAvcStdW0hpHvkWNgG2iF3lpj09Et+6+bwyO5 bjios8TkIhH2i/+/RVYz04+ZeWyOXp19cR1kvjaP6TxhJgH2XYJUqAwwCKnmzOYybAN3 I3t+ZAsaSaZEwB6XmVTR8gOYZlQSPpDiQNFTdngjClKGNICwp60DjPpstrwO7dDP7+Y+ kGcw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id y101-v6si5460240plh.188.2018.04.13.05.43.17; Fri, 13 Apr 2018 05:43:31 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754776AbeDMLjB (ORCPT + 99 others); Fri, 13 Apr 2018 07:39:01 -0400 Received: from mx2.suse.de ([195.135.220.15]:40162 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754752AbeDMLi5 (ORCPT ); Fri, 13 Apr 2018 07:38:57 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (charybdis-ext.suse.de [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id 3CFFEAB37; Fri, 13 Apr 2018 11:38:56 +0000 (UTC) Date: Fri, 13 Apr 2018 13:38:55 +0200 From: Michal Hocko To: Kirill Tkhai Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, vdavydov.dev@gmail.com, cgroups@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH] memcg: Remove memcg_cgroup::id from IDR on mem_cgroup_css_alloc() failure Message-ID: <20180413113855.GI17484@dhcp22.suse.cz> References: <152354470916.22460.14397070748001974638.stgit@localhost.localdomain> <20180413085553.GF17484@dhcp22.suse.cz> <20180413110200.GG17484@dhcp22.suse.cz> <06931a83-91d2-3dcf-31cf-0b98d82e957f@virtuozzo.com> <20180413112036.GH17484@dhcp22.suse.cz> <6dbc33bb-f3d5-1a46-b454-13c6f5865fcd@virtuozzo.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <6dbc33bb-f3d5-1a46-b454-13c6f5865fcd@virtuozzo.com> User-Agent: Mutt/1.9.4 (2018-02-28) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri 13-04-18 14:29:11, Kirill Tkhai wrote: > On 13.04.2018 14:20, Michal Hocko wrote: > > On Fri 13-04-18 14:06:40, Kirill Tkhai wrote: > >> On 13.04.2018 14:02, Michal Hocko wrote: > >>> On Fri 13-04-18 12:35:22, Kirill Tkhai wrote: > >>>> On 13.04.2018 11:55, Michal Hocko wrote: > >>>>> On Thu 12-04-18 17:52:04, Kirill Tkhai wrote: > >>>>> [...] > >>>>>> @@ -4471,6 +4477,7 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css) > >>>>>> > >>>>>> return &memcg->css; > >>>>>> fail: > >>>>>> + mem_cgroup_id_remove(memcg); > >>>>>> mem_cgroup_free(memcg); > >>>>>> return ERR_PTR(-ENOMEM); > >>>>>> } > >>>>> > >>>>> The only path which jumps to fail: here (in the current mmotm tree) is > >>>>> error = memcg_online_kmem(memcg); > >>>>> if (error) > >>>>> goto fail; > >>>>> > >>>>> AFAICS and the only failure path in memcg_online_kmem > >>>>> memcg_id = memcg_alloc_cache_id(); > >>>>> if (memcg_id < 0) > >>>>> return memcg_id; > >>>>> > >>>>> I am not entirely clear on memcg_alloc_cache_id but it seems we do clean > >>>>> up properly. Or am I missing something? > >>>> > >>>> memcg_alloc_cache_id() may allocate a lot of memory, in case of the system reached > >>>> memcg_nr_cache_ids cgroups. In this case it iterates over all LRU lists, and double > >>>> size of every of them. In case of memory pressure it can fail. If this occurs, > >>>> mem_cgroup::id is not unhashed from IDR and we leak this id. > >>> > >>> OK, my bad I was looking at the bad code path. So you want to clean up > >>> after mem_cgroup_alloc not memcg_online_kmem. Now it makes much more > >>> sense. Sorry for the confusion on my end. > >>> > >>> Anyway, shouldn't we do the thing in mem_cgroup_free() to be symmetric > >>> to mem_cgroup_alloc? > >> > >> We can't, since it's called from mem_cgroup_css_free(), which doesn't have a deal > >> with idr freeing. All the asymmetry, we see, is because of the trick to unhash ID > >> earlier, then from mem_cgroup_css_free(). > > > > Are you sure. It's been some time since I've looked at the quite complex > > cgroup tear down code but from what I remember, css_free is called on > > the css release (aka when the reference count drops to zero). mem_cgroup_id_put_many > > seems to unpin the css reference so we should have idr_remove by the > > time when css_free is called. Or am I still wrong and should go over the > > brain hurting cgroup removal code again? > > mem_cgroup_id_put_many() unpins css, but this may be not the last reference to the css. > Thus, we release ID earlier, then all references to css are freed. Right and so what. If we have released the idr then we are not going to do that again in css_free. That is why we have that memcg->id.id > 0 check before idr_remove and memcg->id.id = 0 for the last memcg ref. count. So again, why cannot we do the clean up in mem_cgroup_free and have a less confusing code? Or am I just not getting your point and being dense here? -- Michal Hocko SUSE Labs