Received: by 2002:a25:e7d8:0:0:0:0:0 with SMTP id e207csp2097758ybh; Fri, 13 Mar 2020 12:49:04 -0700 (PDT) X-Google-Smtp-Source: ADFU+vsRWh0QbaH2D3/pQ/Pf+lhWEWZY/jH8pjH8/BlvHolZ7Za1Ax3+xdEwesSXYSYf3D7HQ+5L X-Received: by 2002:a54:4496:: with SMTP id v22mr8438063oiv.132.1584128943909; Fri, 13 Mar 2020 12:49:03 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1584128943; cv=none; d=google.com; s=arc-20160816; b=O2IRmAWFVaX749NB5dQmj++mIvuhxuP7MULNXNoVVv//uTOlQ6dRCo0OVo67T4fd8j IfgqjjXwS+zLyMvv8KaDIgCfJS3CypsMe6iG72xviJouMyie7whCIfxZP/oMl9SwrNAa ZL2eEnpsV7zWpqYLZzC4bDNZTjykymg62TuB2LNJ96sDf3UfldUIXneeMg6Fnma8Y9kz g2eQ1LL7w9/ynVy70C61Ie1evjH/GlvCPCePR7s58Fq95L7yQygOguirBgNvv9xVW73/ M6JwfLE2m889+8qeviM9IotczPTrNa/YVbPfXW8d5GJ71anZ3Hf0sActPabUJB++jCdd rO7Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-language :content-transfer-encoding:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject; bh=vGiIJGqjZWg4D6SeOO6uuOF0B30lmASnx/Cdl+GSrwo=; b=xbL/kPsVJnYsW08iKR0rMF5K39LZbKR9E5l8qnL7KZTxM4c7/iouIwX0HJg9Nf5KMx 4VaY4BX64tLKb/t3jyIxNIyQV4pK5deIgJnfQl+Rb1JrY3xkPG0XrqPI8CCbkp6zu3/l bTyjA55jXjeZp1uh9PEqdaSLCn/S99FKty/MjExBlMnpRALEt6qE+WPVcLV4r4lNBSpo tP2V2bQZoBDQ4DBZ+jO/1BN85A4wgpkFC0ZG7ww4ikBRIO9fpkZiTG1ZOGxNP4KZDj4r Sn69eEJNc7CPUopyaaixkP6lyDlvfCJlzZwJOD6NrMlK4rFfEFs8xhMVtCi9yvbNLaKo lqfA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=alibaba.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id j69si5149346oib.93.2020.03.13.12.48.49; Fri, 13 Mar 2020 12:49:03 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=alibaba.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726666AbgCMTqx (ORCPT + 99 others); Fri, 13 Mar 2020 15:46:53 -0400 Received: from out30-130.freemail.mail.aliyun.com ([115.124.30.130]:58973 "EHLO out30-130.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726477AbgCMTqw (ORCPT ); Fri, 13 Mar 2020 15:46:52 -0400 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R141e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e07417;MF=yang.shi@linux.alibaba.com;NM=1;PH=DS;RN=5;SR=0;TI=SMTPD_---0TsV0fqL_1584128807; Received: from US-143344MP.local(mailfrom:yang.shi@linux.alibaba.com fp:SMTPD_---0TsV0fqL_1584128807) by smtp.aliyun-inc.com(127.0.0.1); Sat, 14 Mar 2020 03:46:49 +0800 Subject: Re: [PATCH 1/2] mm: swap: make page_evictable() inline To: Shakeel Butt Cc: Vlastimil Babka , Andrew Morton , Linux MM , LKML References: <1584124476-76534-1-git-send-email-yang.shi@linux.alibaba.com> From: Yang Shi Message-ID: <520b3295-9fb8-04a7-6215-9bfda4f1a268@linux.alibaba.com> Date: Fri, 13 Mar 2020 12:46:46 -0700 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:52.0) Gecko/20100101 Thunderbird/52.7.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-US Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 3/13/20 12:33 PM, Shakeel Butt wrote: > On Fri, Mar 13, 2020 at 11:34 AM Yang Shi wrote: >> When backporting commit 9c4e6b1a7027 ("mm, mlock, vmscan: no more >> skipping pagevecs") to our 4.9 kernel, our test bench noticed around 10% >> down with a couple of vm-scalability's test cases (lru-file-readonce, >> lru-file-readtwice and lru-file-mmap-read). I didn't see that much down >> on my VM (32c-64g-2nodes). It might be caused by the test configuration, >> which is 32c-256g with NUMA disabled and the tests were run in root memcg, >> so the tests actually stress only one inactive and active lru. It >> sounds not very usual in mordern production environment. >> >> That commit did two major changes: >> 1. Call page_evictable() >> 2. Use smp_mb to force the PG_lru set visible >> >> It looks they contribute the most overhead. The page_evictable() is a >> function which does function prologue and epilogue, and that was used by >> page reclaim path only. However, lru add is a very hot path, so it >> sounds better to make it inline. However, it calls page_mapping() which >> is not inlined either, but the disassemble shows it doesn't do push and >> pop operations and it sounds not very straightforward to inline it. >> >> Other than this, it sounds smp_mb() is not necessary for x86 since >> SetPageLRU is atomic which enforces memory barrier already, replace it >> with smp_mb__after_atomic() in the following patch. >> >> With the two fixes applied, the tests can get back around 5% on that >> test bench and get back normal on my VM. Since the test bench >> configuration is not that usual and I also saw around 6% up on the >> latest upstream, so it sounds good enough IMHO. >> >> The below is test data (lru-file-readtwice throughput) against the v5.6-rc4: >> mainline w/ inline fix >> 150MB 154MB >> > What is the test setup for the above experiment? I would like to get a repro. Just startup a VM with two nodes, then run case-lru-file-readtwice or case-lru-file-readonce in vm-scalability in root memcg or with memcg disabled.  Then get the average throughput (dd result) from the test. Our test bench uses the script from lkp, but I just ran it manually. Single node VM should be more obvious showed in my test. > >> With this patch the throughput gets 2.67% up. The data with using >> smp_mb__after_atomic() is showed in the following patch. >> >> Fixes: 9c4e6b1a7027 ("mm, mlock, vmscan: no more skipping pagevecs") >> Cc: Shakeel Butt >> Cc: Vlastimil Babka >> Signed-off-by: Yang Shi >> --- >> include/linux/swap.h | 24 +++++++++++++++++++++++- >> mm/vmscan.c | 23 ----------------------- >> 2 files changed, 23 insertions(+), 24 deletions(-) >> >> diff --git a/include/linux/swap.h b/include/linux/swap.h >> index 1e99f7a..297eb66 100644 >> --- a/include/linux/swap.h >> +++ b/include/linux/swap.h >> @@ -374,7 +374,29 @@ extern unsigned long mem_cgroup_shrink_node(struct mem_cgroup *mem, >> #define node_reclaim_mode 0 >> #endif >> >> -extern int page_evictable(struct page *page); >> +/* >> + * page_evictable - test whether a page is evictable >> + * @page: the page to test >> + * >> + * Test whether page is evictable--i.e., should be placed on active/inactive >> + * lists vs unevictable list. >> + * >> + * Reasons page might not be evictable: >> + * (1) page's mapping marked unevictable >> + * (2) page is part of an mlocked VMA >> + * >> + */ >> +static inline int page_evictable(struct page *page) >> +{ >> + int ret; >> + >> + /* Prevent address_space of inode and swap cache from being freed */ >> + rcu_read_lock(); >> + ret = !mapping_unevictable(page_mapping(page)) && !PageMlocked(page); >> + rcu_read_unlock(); >> + return ret; >> +} >> + >> extern void check_move_unevictable_pages(struct pagevec *pvec); >> >> extern int kswapd_run(int nid); >> diff --git a/mm/vmscan.c b/mm/vmscan.c >> index 8763705..855c395 100644 >> --- a/mm/vmscan.c >> +++ b/mm/vmscan.c >> @@ -4277,29 +4277,6 @@ int node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned int order) >> } >> #endif >> >> -/* >> - * page_evictable - test whether a page is evictable >> - * @page: the page to test >> - * >> - * Test whether page is evictable--i.e., should be placed on active/inactive >> - * lists vs unevictable list. >> - * >> - * Reasons page might not be evictable: >> - * (1) page's mapping marked unevictable >> - * (2) page is part of an mlocked VMA >> - * >> - */ >> -int page_evictable(struct page *page) >> -{ >> - int ret; >> - >> - /* Prevent address_space of inode and swap cache from being freed */ >> - rcu_read_lock(); >> - ret = !mapping_unevictable(page_mapping(page)) && !PageMlocked(page); >> - rcu_read_unlock(); >> - return ret; >> -} >> - >> /** >> * check_move_unevictable_pages - check pages for evictability and move to >> * appropriate zone lru list >> -- >> 1.8.3.1 >>