Subject: Re: [PATCH v8 03/10] mm/lru: replace pgdat lru_lock with lruvec lock
To: Johannes Weiner
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org,
 hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com,
 yang.shi@linux.alibaba.com, willy@infradead.org, shakeelb@google.com,
 Michal Hocko, Vladimir Davydov, Roman Gushchin, Chris Down, Thomas Gleixner,
 Vlastimil Babka, Qian Cai, Andrey Ryabinin, "Kirill A. Shutemov",
 Jérôme Glisse, Andrea Arcangeli, David Rientjes, "Aneesh Kumar K.V",
 swkhack, "Potyra, Stefan", Mike Rapoport, Stephen Rothwell, Colin Ian King,
 Jason Gunthorpe, Mauro Carvalho Chehab, Peng Fan, Nikolay Borisov,
 Ira Weiny, Kirill Tkhai, Yafang Shao, Wei Yang
References: <1579143909-156105-1-git-send-email-alex.shi@linux.alibaba.com>
 <1579143909-156105-4-git-send-email-alex.shi@linux.alibaba.com>
 <20200116215222.GA64230@cmpxchg.org>
 <20200413180725.GA99267@cmpxchg.org>
From: Alex Shi <alex.shi@linux.alibaba.com>
Message-ID: <42d5c2cb-3019-993f-eba7-33a1d69ef699@linux.alibaba.com>
Date: Tue, 14 Apr 2020 16:19:01 +0800
In-Reply-To: <20200413180725.GA99267@cmpxchg.org>

On 2020/4/14 2:07, Johannes Weiner wrote:
> But isolation actually needs to lock out charging, or it would operate
> on the wrong list:
>
>     isolation:                              commit_charge:
>     if (TestClearPageLRU(page))
>                                             page->mem_cgroup = new
>                                             // page is still physically on
>                                             // the root_mem_cgroup's LRU. We're
>                                             // updating the wrong list:
>       memcg = page->mem_cgroup
>       spin_lock(memcg->lru_lock)
>       del_page_from_lru_list(page, memcg)
>       spin_unlock(memcg->lru_lock)
>
> lrucare really is a mess. Even before this patch series, it makes
> things tricky and subtle and error prone.
>
> The only reason we're doing it is for when there is swapping without
> swap tracking, in which case swap readahead needs to put pages on the
> LRU but cannot charge them until we have a faulting vma later.
>
> But it's not clear how practical such a configuration is. Both memory
> and swap are shared resources, and isolation isn't really effective
> when you restrict access to memory but then let workloads swap freely.
>
> Plus, the overhead of tracking is tiny - 512k per G of swap (0.04%).
>
> Maybe we should just delete MEMCG_SWAP and unconditionally track swap
> entry ownership when the memory controller is enabled. I don't see a
> good reason not to, and it would simplify the entire swapin path, the
> LRU locking, and the page->mem_cgroup stabilization rules.

Hi Johannes,

I think what you mean here is to keep the swap_cgroup id even after the
page is swapped out, so when we read the page back from the swap disk we
don't need to charge it again. Then all other memcg charging happens only
on pages that are not yet on an LRU list, and no isolation is required in
the awkward scenario above. That sounds like a good idea.

So split_huge_page and mem_cgroup_migrate should be safe, while task
cgroup migration may need to take the extra from_vec->lru_lock. Is that
right?

That's a good idea. I'd be glad to give it a try...

BTW, as for pages from different memcgs getting mixed together on the
swap disk over time, maybe we could try Tim Chen's swap_slot work for
memcg. What do you think?

Thanks
Alex
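
P.S. For reference, a minimal C sketch of the isolation ordering being
discussed above. This is not from the actual patch series; the helper
name isolate_page_from_lru() and its exact shape are assumptions for
illustration, and lruvec->lru_lock is the per-memcg lock this series
proposes rather than the pgdat lock in the current tree. The point it
tries to show: if charging always commits page->mem_cgroup before the
page is ever added to an LRU, then TestClearPageLRU() alone stabilizes
page->mem_cgroup, and isolation can safely take that lruvec's lock.

#include <linux/mm.h>
#include <linux/mm_inline.h>
#include <linux/memcontrol.h>
#include <linux/swap.h>

/*
 * Illustrative only: isolate a page from whatever LRU list it is on,
 * relying on TestClearPageLRU() to keep page->mem_cgroup stable.
 * Assumes charging sets page->mem_cgroup *before* the page goes onto
 * an LRU, so the commit_charge() race shown above cannot happen.
 * A real implementation would also need to hold a page reference.
 */
static bool isolate_page_from_lru(struct page *page, struct list_head *dst)
{
	struct lruvec *lruvec;

	/*
	 * Clearing PG_lru acts as the "isolation lock": once it is
	 * clear, nobody else may move the page between LRU lists.
	 */
	if (!TestClearPageLRU(page))
		return false;

	/*
	 * page->mem_cgroup is stable now, so this lruvec is the one
	 * the page is physically queued on.
	 */
	lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));

	spin_lock_irq(&lruvec->lru_lock);
	del_page_from_lru_list(page, lruvec, page_lru(page));
	spin_unlock_irq(&lruvec->lru_lock);

	list_add(&page->lru, dst);
	return true;
}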