Received: by 2002:a25:c205:0:0:0:0:0 with SMTP id s5csp517225ybf; Wed, 26 Feb 2020 17:38:11 -0800 (PST) X-Google-Smtp-Source: APXvYqyzfmxKagD/FTXt7zS5DFgDkYaczFiIQVtilFAeFVuV8+qDfnz07BpGWH0zjoCqPbhKkCKE X-Received: by 2002:a9d:624e:: with SMTP id i14mr1380374otk.371.1582767491494; Wed, 26 Feb 2020 17:38:11 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1582767491; cv=none; d=google.com; s=arc-20160816; b=d/FGQ9CUVFHqWZiAdVfWgg6izkmAuJCeuJuXUcOAipk+gFgbgxVdeJe9zSms+Y2KyP 3VW97YVHit5LTyfWo2RuCMPhyVcBcEbUlmccqcG2z5QvDbKTcUQWFk8sILw9v3pc5i49 7NDIB0Lz+aF+6aBx6r2L+kMiWAa2p5Xqmo+urv4v5vkyp2mE8SJUAizbB0IfjM+LWvzp vzojphHk4rj/yUlPp8XsZTpWETcLPKr4mkNdbuHp2RrCKtlIdxKDUutr2jl6i5yN3PXs 68W5cquJzQCWkL3K3veyLiqXLdDj2fEbEYVq/JYfGtJ9+rXZ7vlW6jlP9WY7cnqa0Lml yvng== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:user-agent:references :message-id:in-reply-to:subject:cc:to:from:date:dkim-signature; bh=vZDwfNLVUBH6SGm2FcQoiJLrDZVGLDDEbJUTgzVuEvs=; b=k1nj/F5e0glPxzwH2NMaeTxoon6WzTmIHBVHTJiInlIcfoifsXOl1N88yK/jhLeynV o3Fvi6VPr3KPrSTEfOlYS48Z9QETkOet6h4VYxkWBDGQJGcGmGCVXvPrHUF3s/xl0yS+ P9dA7XP2URCJyN1OsGLt/3xbtjod+uHrCRVmoXNE8lrap0/IHmxOtJ03Yx9H2bGbzjBE hiK09+gBRioo1dUsyo/7NUnDRbN/wJndWWCuTt0XFM9qwcBhUt3qrKzvPxuJKcFUwkcZ WH53b7TVWFPY0j1gKPos4bdSoOVb27C2AuP7QN08LZuBIhJecOs5HjTEq4lTcY6wTnI9 fkdg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=mXd5SVYY; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id k1si526737oic.245.2020.02.26.17.38.00; Wed, 26 Feb 2020 17:38:11 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=mXd5SVYY; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728241AbgB0Bh1 (ORCPT + 99 others); Wed, 26 Feb 2020 20:37:27 -0500 Received: from mail-pj1-f68.google.com ([209.85.216.68]:34899 "EHLO mail-pj1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728091AbgB0Bh1 (ORCPT ); Wed, 26 Feb 2020 20:37:27 -0500 Received: by mail-pj1-f68.google.com with SMTP id q39so439149pjc.0 for ; Wed, 26 Feb 2020 17:37:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:from:to:cc:subject:in-reply-to:message-id:references :user-agent:mime-version; bh=vZDwfNLVUBH6SGm2FcQoiJLrDZVGLDDEbJUTgzVuEvs=; b=mXd5SVYYTzZ17GeQ+3kEKesccxr9w0WIIKJiKARi3WPT7EYMzXv3gdtI/z344fipaC VfFDEZNUUAJyhUSvlQc4+8Db8BZmXG/l9lcTBaOEmXWnqnr/WlcSIbNCRdkap2IIpH9E FVAZisCtnUFEV4sazZsGkW+lwxr54D99nLRMaN3uU2lXNSdMn+0VLmZ63Np7r1BSVP7Q /ZjTH20IRBm80xiwEWzIaeBhFfoXjaRe9mVPo4LZ7kf3OJR3mSGvHUWS2hHW4yao3uL1 TGB6xES+l4itrLSdjHGHyEDg4xOQQ1snSsMA0uIy4MO/gkKahmYYoS/He2VKiWhmVNqW 1mbQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:in-reply-to:message-id :references:user-agent:mime-version; bh=vZDwfNLVUBH6SGm2FcQoiJLrDZVGLDDEbJUTgzVuEvs=; b=YiYYNJ67nylYaRhIXVTE3rpORHS6BdqNdff4JOxb8sowDl5fSF8Vk9XsjUwZkX8jw0 tRaKzDypXQm/+xYsuZythsedOKndy60iRcMn45n0BUN7BGaTOfGXoKbRZXPiJm63UhqW cjiAMDv+x6pqBJ0+DnT0qWdjQQnxKYIs4SC1wZMD8wpzvuIvnyjYoOFutP9g7zagAUaP ovI8KEUS8lLdPvNF36lFXLB1ZxbwrpPAGUwZT+SrD/vNJT29IlpDE1qAbGWTqe70QdAR FR9Rb5xuzsO8SwDQ2z52rpq5HZ5Qe//lNlJtSjw4w/maT06xNQAbgJHZy7xQUlHaDKPU em2Q== X-Gm-Message-State: APjAAAW0WdOFgEvvYi6/JUmfPcNdaShj9SJffmgY/QEnDY6eP2IemEnI /Ygk93M9gZ4E/MKzwb5DqD40xg== X-Received: by 2002:a17:90a:3603:: with SMTP id s3mr2152716pjb.61.1582767444239; Wed, 26 Feb 2020 17:37:24 -0800 (PST) Received: from [100.112.92.218] ([104.133.9.106]) by smtp.gmail.com with ESMTPSA id w11sm4037509pgh.5.2020.02.26.17.37.23 (version=TLS1 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Wed, 26 Feb 2020 17:37:23 -0800 (PST) Date: Wed, 26 Feb 2020 17:37:22 -0800 (PST) From: Hugh Dickins X-X-Sender: hugh@eggly.anvils To: Yang Shi cc: Hugh Dickins , kirill.shutemov@linux.intel.com, aarcange@redhat.com, akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: Re: [v2 PATCH] mm: shmem: allow split THP when truncating THP partially In-Reply-To: Message-ID: References: <1575420174-19171-1-git-send-email-yang.shi@linux.alibaba.com> <00f0bb7d-3c25-a65f-ea94-3e2de8e9bcdd@linux.alibaba.com> User-Agent: Alpine 2.11 (LSU 23 2013-08-11) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, 26 Feb 2020, Yang Shi wrote: > On 2/24/20 7:46 PM, Hugh Dickins wrote: > > > > I did willingly call my find_get_entries() stopping at PageTransCompound > > a hack; but now think we should just document that behavior and accept it. > > The contortions of your patch come from the need to release those 14 extra > > unwanted references: much better not to get them in the first place. > > > > Neither of us handle a failed split optimally, we treat every tail as an > > opportunity to retry: which is good to recover from transient failures, > > but probably excessive. And we both have to restart the pagevec after > > each attempt, but at least I don't get 14 unwanted extras each time. > > > > What of other find_get_entries() users and its pagevec_lookup_entries() > > wrapper: does an argument to select this "stop at PageTransCompound" > > behavior need to be added? > > > > No. The pagevec_lookup_entries() calls from mm/truncate.c prefer the > > new behavior - evicting the head from page cache removes all the tails > > along with it, so getting the tails a waste of time there too, just as > > it was in shmem_undo_range(). > > TBH I'm not a fun of this hack. This would bring in other confusion or > complexity. Pagevec is supposed to count in the number of base page, now it > would treat THP as one page, and there might be mixed base page and THP in > one pagevec. I agree that it would be horrid if find_get_entries() and pagevec_lookup_entries() were switched to returning just one page for a THP, demanding all callers to cope with its huge size along with the small sizes of other pages in the vector. I don't know how to get such an interface to work at all: it's essential to be able to deliver tail pages from a requested offset in the compound page. No, that's not what the find_get_entries() modification does: it takes advantage of the fact that no caller expects it to guarantee a full pagevec, so terminates the pagevec early when it encounters any head or tail subpage of the compound page. Then the next call to it (if caller does not have code to skip the extent - which removal of head from page cache does) returns just the next tail, etc, until all have been delivered. All as small pages. (Aside from the comments, I have made one adjustment to what I showed before: though it appears now that hugetlbfs happens not to use pagevec_lookup_entries(), not directly anyway, I'm more comfortable checking PageTransHuge && !PageHuge, so that it would not go one-at-a-time on hugetlbfs pages. But found I was wrong earlier when I said the "page->index + HPAGE_PMD_NR <= end" test needed correcting for 32-bit: it's working on PageHead there, so there's no chance of that "+ HPAGE_PMD_NR" wrapping around: left unchanged, what it's doing is clearer that way than with macros.) Hugh > But, I tend to agree avoiding getting those 14 extra pins at the > first place might be a better approach. All the complexity are used to > release those extra pins.