Received: by 2002:a25:4158:0:0:0:0:0 with SMTP id o85csp5622129yba; Mon, 13 May 2019 14:13:01 -0700 (PDT) X-Google-Smtp-Source: APXvYqzr+QH7Y0VW474x1bPih2Hgo+S/Q/wT3ld8mO0U67j8XHB/q9EvnBRtLYDuLSElyDZSHnpB X-Received: by 2002:aa7:8a11:: with SMTP id m17mr13789592pfa.122.1557781981125; Mon, 13 May 2019 14:13:01 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1557781981; cv=none; d=google.com; s=arc-20160816; b=pUYbknXM02FOMMmwkqJ3tYRTPZRS7+7w9zhl8/xTDe7URyXZ44PuF498ZWsrwbGj0M L6sBljtRBLoz9MkeY/uz0MVyhIHeWunthSGmfeCESV7/nUw0J+WcogVAeNp6lXoWZ2up GcLY/mQCrCSKMYrQv+HXhzZKMHG/ocYNUcEkQ10Vh3Ub/QpuBi9vj2MjtYGgydAU2ohh Z6JfePjOIbhrTNMjk1AlOMvGNZPAzK5aoGZZ/qZBfgy2cal6JP042Y5dcUWYyh0qVZGz IOHTnykDkC+x9K+C92R4ab4yjFbZH91dWnOPXCtcbdBkBVUH/MFfypXShAtRQStMfNyk ppVA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-language :content-transfer-encoding:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject; bh=MTiAwlin3qrBjAX7kn2wF1oZRL1K9j/Ma/pCwQlx33E=; b=NJj0sJa88mCMBBOiP+dnm8Z+kHJW2m+3yjjucqdU2lNfRkZJnI/iu+bWMGoRJQfvDw CTyt3H3MV4bz3joPshgALm7lHPNG3oHx1upSs/Ykb5zFCNtY97XYVNGZJ7dWZkuhhrOK uQ9FcSuEwLexzaL3df4ohbqc0QRYm9YjkHipnwakVyVrouTbmaku4A+WLwyUXeZp3YBR Rvr9VLXwzPo4q9JmNsRbjLPhezTZqSyEljR1YuZoQw7YK6eWTss4rNPnrYVbUWJDs4D8 bSm8FhtyH0cZtSEYNFKBGnWFga69SpWckmSQ8INkqgxNakTqHHxUkIKWb5vIxlVEYN/9 r6/A== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=alibaba.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id d92si19064200pld.84.2019.05.13.14.12.45; Mon, 13 May 2019 14:13:01 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=alibaba.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726465AbfEMVKc (ORCPT + 99 others); Mon, 13 May 2019 17:10:32 -0400 Received: from out4437.biz.mail.alibaba.com ([47.88.44.37]:24784 "EHLO out4437.biz.mail.alibaba.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726218AbfEMVKb (ORCPT ); Mon, 13 May 2019 17:10:31 -0400 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R921e4;CH=green;DM=||false|;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01f04391;MF=yang.shi@linux.alibaba.com;NM=1;PH=DS;RN=11;SR=0;TI=SMTPD_---0TRdjdXM_1557781806; Received: from US-143344MP.local(mailfrom:yang.shi@linux.alibaba.com fp:SMTPD_---0TRdjdXM_1557781806) by smtp.aliyun-inc.com(127.0.0.1); Tue, 14 May 2019 05:10:09 +0800 Subject: Re: [v2 PATCH] mm: vmscan: correct nr_reclaimed for THP To: Michal Hocko Cc: ying.huang@intel.com, hannes@cmpxchg.org, mgorman@techsingularity.net, kirill.shutemov@linux.intel.com, hughd@google.com, shakeelb@google.com, william.kucharski@oracle.com, akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org References: <1557505420-21809-1-git-send-email-yang.shi@linux.alibaba.com> <20190513080929.GC24036@dhcp22.suse.cz> From: Yang Shi Message-ID: Date: Mon, 13 May 2019 14:09:59 -0700 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:52.0) Gecko/20100101 Thunderbird/52.7.0 MIME-Version: 1.0 In-Reply-To: <20190513080929.GC24036@dhcp22.suse.cz> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Content-Language: en-US Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 5/13/19 1:09 AM, Michal Hocko wrote: > On Sat 11-05-19 00:23:40, Yang Shi wrote: >> Since commit bd4c82c22c36 ("mm, THP, swap: delay splitting THP after >> swapped out"), THP can be swapped out in a whole. But, nr_reclaimed >> still gets inc'ed by one even though a whole THP (512 pages) gets >> swapped out. >> >> This doesn't make too much sense to memory reclaim. For example, direct >> reclaim may just need reclaim SWAP_CLUSTER_MAX pages, reclaiming one THP >> could fulfill it. But, if nr_reclaimed is not increased correctly, >> direct reclaim may just waste time to reclaim more pages, >> SWAP_CLUSTER_MAX * 512 pages in worst case. > You are technically right here. This has been a known issue for a while. > I am wondering whether somebody actually noticed some misbehavior. > Swapping out is a rare event and you have to have a considerable number > of THPs to notice. The commit bd4c82c22c36 ("mm, THP, swap: delay splitting THP after swapped out") was added in 4.14, it might be not used widely yet. And we need swap + THPs to trigger it, furthermore the misbehavior might be accumulative over time. I don't expect it will have any obvious misbehavior for a single shot. > >> This change may result in more reclaimed pages than scanned pages showed >> by /proc/vmstat since scanning one head page would reclaim 512 base pages. > This is quite nasty and confusing. I am worried that having those two > unsynced begs for subtle issues. Can we account THP as scanning 512 base > pages as well? Actually, such unsync has been there for a while. The isolate_lru_pages() returns nr_taken which has THP accounted as 512 base pages, but nr_scanned just shows 1. nr_taken is not showed via /proc/vmstat, but it is showed via trace point. I think we can just account 512 base pages for nr_scanned for isolate_lru_pages() to make the counters sane since PGSCAN_KSWAPD/DIRECT just use it. And, sc->nr_scanned should be accounted as 512 base pages too otherwise we may have nr_scanned < nr_to_reclaim all the time to result in false-negative for priority raise and something else wrong (e.g. wrong vmpressure). > >> Cc: "Huang, Ying" >> Cc: Johannes Weiner >> Cc: Michal Hocko >> Cc: Mel Gorman >> Cc: "Kirill A . Shutemov" >> Cc: Hugh Dickins >> Reviewed-by: Shakeel Butt >> Signed-off-by: Yang Shi >> --- >> v2: Added Shakeel's Reviewed-by >> Use hpage_nr_pages instead of compound_order per Huang Ying and William Kucharski >> >> mm/vmscan.c | 6 +++++- >> 1 file changed, 5 insertions(+), 1 deletion(-) >> >> diff --git a/mm/vmscan.c b/mm/vmscan.c >> index fd9de50..4226d6b 100644 >> --- a/mm/vmscan.c >> +++ b/mm/vmscan.c >> @@ -1446,7 +1446,11 @@ static unsigned long shrink_page_list(struct list_head *page_list, >> >> unlock_page(page); >> free_it: >> - nr_reclaimed++; >> + /* >> + * THP may get swapped out in a whole, need account >> + * all base pages. >> + */ >> + nr_reclaimed += hpage_nr_pages(page); >> >> /* >> * Is there need to periodically free_page_list? It would >> -- >> 1.8.3.1 >>