Date: Fri, 4 May 2018 16:20:57 +0200
From: Oleg Nesterov
To: "Eric W. Biederman"
Cc: Johannes Weiner, Michal Hocko, Kirill Tkhai, akpm@linux-foundation.org, peterz@infradead.org, viro@zeniv.linux.org.uk, mingo@kernel.org, paulmck@linux.vnet.ibm.com, keescook@chromium.org, riel@redhat.com, tglx@linutronix.de, kirill.shutemov@linux.intel.com, marcos.souza.org@gmail.com, hoeun.ryu@gmail.com, pasha.tatashin@oracle.com, gs051095@gmail.com, dhowells@redhat.com, rppt@linux.vnet.ibm.com, linux-kernel@vger.kernel.org, Balbir Singh, Tejun Heo
Subject: Re: [PATCH] memcg: Replace mm->owner with mm->memcg
Message-ID: <20180504142056.GA26151@redhat.com>
In-Reply-To: <87y3h0x0qg.fsf@xmission.com>

On 05/03, Eric W. Biederman wrote:
>
> Oleg Nesterov writes:
>
> > On 05/02, Eric W. Biederman wrote:
> >>
> >> +static void mem_cgroup_fork(struct task_struct *tsk)
> >> +{
> >> +	struct cgroup_subsys_state *css;
> >> +
> >> +	rcu_read_lock();
> >> +	css = task_css(tsk, memory_cgrp_id);
> >> +	if (css && css_tryget(css))
> >> +		task_update_memcg(tsk, mem_cgroup_from_css(css));
> >> +	rcu_read_unlock();
> >> +}
> >
> > Why do we need it?
> >
> > The child's mm->memcg was already initialized by mm_init_memcg() and we can't
> > race with migrate until cgroup_threadgroup_change_end() ?
>
> I admit I missed the cgroup_threadgroup_change_begin
> cgroup_threadgroup_change_end pair in fs fork. In this case it doesn't
> matter because mm_init_memcg is called from:
>
> copy_mm
>   dup_mm
>     mm_init
>
> And copy_mm is called before we call cgroup_threadgroup_change_begin.
> So the race remains.

Ah yes, you are right.

> We could move cgroup_threadgroup_change_begin earlier, to remove
> the need for mem_cgroup_fork. But I have not analyzed that.

No, cgroup_threadgroup_change_begin() was called early and this was wrong, see
568ac888215c7fb2fabe8ea739b00ec3c1f5d440. Actually there were more problems, say
copy_net() could deadlock because cleanup_net() does do_wait() with net_mutex held.

OK, what about exec() ? mm_init_memcg() initializes bprm->mm->memcg early in
bprm_mm_init(). What if the execing task migrates before exec_mmap() ?

Oleg.