Date: Mon, 21 Sep 2020 22:40:10 -0700 (PDT)
From: Hugh Dickins <hughd@google.com>
To: Alex Shi <alex.shi@linux.alibaba.com>
cc: akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org,
    hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com,
    willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    cgroups@vger.kernel.org, shakeelb@google.com, iamjoonsoo.kim@lge.com,
    richard.weiyang@gmail.com, kirill@shutemov.name,
    alexander.duyck@gmail.com, rong.a.chen@intel.com, mhocko@suse.com,
    vdavydov.dev@gmail.com, shy828301@gmail.com, Alexander Duyck,
    Thomas Gleixner, Andrey Ryabinin
Subject: Re: [PATCH v18 21/32] mm/lru: introduce the relock_page_lruvec function
In-Reply-To: <1598273705-69124-22-git-send-email-alex.shi@linux.alibaba.com>
References: <1598273705-69124-1-git-send-email-alex.shi@linux.alibaba.com>
    <1598273705-69124-22-git-send-email-alex.shi@linux.alibaba.com>

On Mon, 24 Aug 2020, Alex Shi wrote:

> From: Alexander Duyck
>
> Use this new function to replace the same code repeated in several
> places; no functional change.
>
> When testing for relock we can avoid the need for RCU locking if we
> simply compare the page's pgdat and memcg pointers against those that
> the lruvec is holding. By doing this we can avoid the extra pointer
> walks and accesses of the memory cgroup.
>
> In addition we can avoid the checks entirely if lruvec is currently
> NULL.
>
> Signed-off-by: Alexander Duyck
> Signed-off-by: Alex Shi

Again, I'll wait to see __munlock_pagevec() fixed before Acking this
one, but that's the only issue. And I've suggested that you use
lruvec_holds_page_lru_lock() in mm/vmscan.c move_pages_to_lru(), to
replace the uglier and less efficient VM_BUG_ON_PAGE there (a sketch
of what I have in mind is below).

Oh, there is one other issue: the 0day robot did report (2020-06-19)
that sparse doesn't understand relock_page_lruvec*(). I've never got
around to working out how to write what it needs: a conditional
__release plus __acquire in some form, I imagine. I've never got into
sparse annotations before; I'll give it a try, but if anyone beats me
to it, that will be welcome. And there are higher priorities: I do not
think you should wait for the sparse warning to be fixed before
reposting.
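On the sparse point, for anyone who wants to beat me to it: the
primitives are the no-op context markers that sparse sees when
__CHECKER__ is defined (see include/linux/compiler_types.h), plus the
__acquires()/__releases() function attributes built on them. Below is a
purely illustrative, untested sketch of the easy, unconditional case;
switch_locks() is a made-up name, not a kernel API, and the relock
helpers are harder precisely because their unlock is conditional:

#include <linux/spinlock.h>

/*
 * Sketch only, untested: __releases()/__acquires() tell sparse that
 * the function is entered holding "old" and returns holding "new",
 * and sparse then checks that the body keeps that promise.
 */
static void switch_locks(spinlock_t *old, spinlock_t *new)
	__releases(old)
	__acquires(new)
{
	spin_unlock(old);
	spin_lock(new);
}

What relock_page_lruvec_irq() needs is a conditional version of that,
"releases A and acquires B, unless they are the same lock", which the
existing annotations do not obviously express.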
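And to spell out the move_pages_to_lru() suggestion above: something
like the single line below, in place of the open-coded pgdat/memcg
comparison that is there today (a sketch of the idea only, not part of
this patch; the exact placement in mm/vmscan.c is up to you):

/*
 * Hypothetical sketch: every page on the list was isolated from this
 * lruvec, and isolation blocks memcg migration, so assert that the
 * lock we hold really is the lru_lock covering the page.
 */
VM_BUG_ON_PAGE(!lruvec_holds_page_lru_lock(page, lruvec), page);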
> Cc: Johannes Weiner
> Cc: Andrew Morton
> Cc: Thomas Gleixner
> Cc: Andrey Ryabinin
> Cc: Matthew Wilcox
> Cc: Mel Gorman
> Cc: Konstantin Khlebnikov
> Cc: Hugh Dickins
> Cc: Tejun Heo
> Cc: linux-kernel@vger.kernel.org
> Cc: cgroups@vger.kernel.org
> Cc: linux-mm@kvack.org
> ---
>  include/linux/memcontrol.h | 52 ++++++++++++++++++++++++++++++++++++++++++++++
>  mm/mlock.c                 |  9 +-------
>  mm/swap.c                  | 33 +++++++----------------------
>  mm/vmscan.c                |  8 +------
>  4 files changed, 61 insertions(+), 41 deletions(-)
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 7b170e9028b5..ee6ef2d8ad52 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -488,6 +488,22 @@ static inline struct lruvec *mem_cgroup_lruvec(struct mem_cgroup *memcg,
>
>  struct lruvec *mem_cgroup_page_lruvec(struct page *, struct pglist_data *);
>
> +static inline bool lruvec_holds_page_lru_lock(struct page *page,
> +					      struct lruvec *lruvec)
> +{
> +	pg_data_t *pgdat = page_pgdat(page);
> +	const struct mem_cgroup *memcg;
> +	struct mem_cgroup_per_node *mz;
> +
> +	if (mem_cgroup_disabled())
> +		return lruvec == &pgdat->__lruvec;
> +
> +	mz = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
> +	memcg = page->mem_cgroup ? : root_mem_cgroup;
> +
> +	return lruvec->pgdat == pgdat && mz->memcg == memcg;
> +}
> +
>  struct mem_cgroup *mem_cgroup_from_task(struct task_struct *p);
>
>  struct mem_cgroup *get_mem_cgroup_from_mm(struct mm_struct *mm);
> @@ -1023,6 +1039,14 @@ static inline struct lruvec *mem_cgroup_page_lruvec(struct page *page,
>  	return &pgdat->__lruvec;
>  }
>
> +static inline bool lruvec_holds_page_lru_lock(struct page *page,
> +					      struct lruvec *lruvec)
> +{
> +	pg_data_t *pgdat = page_pgdat(page);
> +
> +	return lruvec == &pgdat->__lruvec;
> +}
> +
>  static inline struct mem_cgroup *parent_mem_cgroup(struct mem_cgroup *memcg)
>  {
>  	return NULL;
> @@ -1469,6 +1493,34 @@ static inline void unlock_page_lruvec_irqrestore(struct lruvec *lruvec,
>  	spin_unlock_irqrestore(&lruvec->lru_lock, flags);
>  }
>
> +/* Don't lock again iff page's lruvec locked */
> +static inline struct lruvec *relock_page_lruvec_irq(struct page *page,
> +		struct lruvec *locked_lruvec)
> +{
> +	if (locked_lruvec) {
> +		if (lruvec_holds_page_lru_lock(page, locked_lruvec))
> +			return locked_lruvec;
> +
> +		unlock_page_lruvec_irq(locked_lruvec);
> +	}
> +
> +	return lock_page_lruvec_irq(page);
> +}
> +
> +/* Don't lock again iff page's lruvec locked */
> +static inline struct lruvec *relock_page_lruvec_irqsave(struct page *page,
> +		struct lruvec *locked_lruvec, unsigned long *flags)
> +{
> +	if (locked_lruvec) {
> +		if (lruvec_holds_page_lru_lock(page, locked_lruvec))
> +			return locked_lruvec;
> +
> +		unlock_page_lruvec_irqrestore(locked_lruvec, *flags);
> +	}
> +
> +	return lock_page_lruvec_irqsave(page, flags);
> +}
> +
>  #ifdef CONFIG_CGROUP_WRITEBACK
>
>  struct wb_domain *mem_cgroup_wb_domain(struct bdi_writeback *wb);
> diff --git a/mm/mlock.c b/mm/mlock.c
> index 177d2588e863..0448409184e3 100644
> --- a/mm/mlock.c
> +++ b/mm/mlock.c
> @@ -302,17 +302,10 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
>  	/* Phase 1: page isolation */
>  	for (i = 0; i < nr; i++) {
>  		struct page *page = pvec->pages[i];
> -		struct lruvec *new_lruvec;
>
>  		/* block memcg change in mem_cgroup_move_account */
>  		lock_page_memcg(page);
> -		new_lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
> -		if (new_lruvec != lruvec) {
> -			if (lruvec)
> -				unlock_page_lruvec_irq(lruvec);
> -			lruvec = lock_page_lruvec_irq(page);
> -		}
> -
> +		lruvec = relock_page_lruvec_irq(page, lruvec);
>  		if (TestClearPageMlocked(page)) {
>  			/*
>  			 * We already have pin from follow_page_mask()
> diff --git a/mm/swap.c b/mm/swap.c
> index b67959b701c0..2ac78e8fab71 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -209,19 +209,12 @@ static void pagevec_lru_move_fn(struct pagevec *pvec,
>
>  	for (i = 0; i < pagevec_count(pvec); i++) {
>  		struct page *page = pvec->pages[i];
> -		struct lruvec *new_lruvec;
>
>  		/* block memcg migration during page moving between lru */
>  		if (!TestClearPageLRU(page))
>  			continue;
>
> -		new_lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
> -		if (lruvec != new_lruvec) {
> -			if (lruvec)
> -				unlock_page_lruvec_irqrestore(lruvec, flags);
> -			lruvec = lock_page_lruvec_irqsave(page, &flags);
> -		}
> -
> +		lruvec = relock_page_lruvec_irqsave(page, lruvec, &flags);
>  		(*move_fn)(page, lruvec);
>
>  		SetPageLRU(page);
> @@ -865,17 +858,12 @@ void release_pages(struct page **pages, int nr)
>  		}
>
>  		if (PageLRU(page)) {
> -			struct lruvec *new_lruvec;
> -
> -			new_lruvec = mem_cgroup_page_lruvec(page,
> -							page_pgdat(page));
> -			if (new_lruvec != lruvec) {
> -				if (lruvec)
> -					unlock_page_lruvec_irqrestore(lruvec,
> -									flags);
> +			struct lruvec *prev_lruvec = lruvec;
> +
> +			lruvec = relock_page_lruvec_irqsave(page, lruvec,
> +									&flags);
> +			if (prev_lruvec != lruvec)
>  				lock_batch = 0;
> -				lruvec = lock_page_lruvec_irqsave(page, &flags);
> -			}
>
>  			VM_BUG_ON_PAGE(!PageLRU(page), page);
>  			__ClearPageLRU(page);
> @@ -982,15 +970,8 @@ void __pagevec_lru_add(struct pagevec *pvec)
>
>  	for (i = 0; i < pagevec_count(pvec); i++) {
>  		struct page *page = pvec->pages[i];
> -		struct lruvec *new_lruvec;
> -
> -		new_lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
> -		if (lruvec != new_lruvec) {
> -			if (lruvec)
> -				unlock_page_lruvec_irqrestore(lruvec, flags);
> -			lruvec = lock_page_lruvec_irqsave(page, &flags);
> -		}
>
> +		lruvec = relock_page_lruvec_irqsave(page, lruvec, &flags);
>  		__pagevec_lru_add_fn(page, lruvec);
>  	}
>  	if (lruvec)
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 789444ae4c88..2c94790d4cb1 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -4280,15 +4280,9 @@ void check_move_unevictable_pages(struct pagevec *pvec)
>
>  	for (i = 0; i < pvec->nr; i++) {
>  		struct page *page = pvec->pages[i];
> -		struct lruvec *new_lruvec;
>
>  		pgscanned++;
> -		new_lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
> -		if (lruvec != new_lruvec) {
> -			if (lruvec)
> -				unlock_page_lruvec_irq(lruvec);
> -			lruvec = lock_page_lruvec_irq(page);
> -		}
> +		lruvec = relock_page_lruvec_irq(page, lruvec);
>
>  		if (!PageLRU(page) || !PageUnevictable(page))
>  			continue;
> -- 
> 1.8.3.1