From: Linus Torvalds
Date: Wed, 9 Oct 2019 19:07:24 -0700
Subject: Re: [PATCH v4 3/9] mm: pagewalk: Don't split transhuge pmds when a pmd_entry is present
To: Thomas Hellström (VMware)
Cc: Thomas Hellstrom, "Kirill A. Shutemov", Linux Kernel Mailing List, Linux-MM, Matthew Wilcox, Will Deacon, Peter Zijlstra, Rik van Riel, Minchan Kim, Michal Hocko, Huang Ying, Jérôme Glisse
In-Reply-To: <6d3ef513-ca9d-9778-10da-86f368a57cd0@shipmail.org>
References: <20191008091508.2682-1-thomas_os@shipmail.org> <20191008091508.2682-4-thomas_os@shipmail.org> <20191009152737.p42w7w456zklxz72@box> <03d85a6a-e24a-82f4-93b8-86584b463471@shipmail.org> <80f25292-585c-7729-2a23-7c46b3309a1a@shipmail.org> <6d3ef513-ca9d-9778-10da-86f368a57cd0@shipmail.org>

On Wed, Oct 9, 2019 at 6:10 PM Thomas Hellström (VMware) wrote:
>
> Your original patch does exactly the same!

Oh, no. You misread my original patch.

Look again.

The logic in my original patch was very different. It said that

 - *if* we have a pmd_entry function, then we obviously call that one.

   And if - after calling the pmd_entry function - we are still a
   hugepage, then we skip the pte_entry case entirely.

   And part of skipping is obviously "don't split" - but it never had
   that "don't split and then call the pte walker" case.

 - and if we *don't* have a pmd_entry function, but we do have a
   pte_entry function, then we always split before calling it.

Notice the difference?

So instead of looking at the return value of the pmd_entry() function,
the approach of that suggested patch was to basically say that if the
pmd_entry function wants us to go another level deeper and it was a
hugepmd, it needed to split the pmd to make that happen.

That's actually very different from what your patch did. My original
patch never tried to walk the pte level without having one - either it
*checked* that it had one, or it split.

But I see where you might have misread the patch, particularly if you
only looked at it as a patch, not as the end result of the patch. The
end result of that patch was this:

        if (ops->pmd_entry) {
                err = ops->pmd_entry(pmd, addr, next, walk);
                if (err)
                        break;
                /* No pte level walking? */
                if (!ops->pte_entry)
                        continue;
                /* No pte level at all? */
                if (is_swap_pmd(*pmd) || pmd_trans_huge(*pmd) || pmd_devmap(*pmd))
                        continue;
        } else {
                if (!ops->pte_entry)
                        continue;

                split_huge_pmd(walk->vma, pmd, addr);
                if (pmd_trans_unstable(pmd))
                        goto again;
        }
        err = walk_pte_range(pmd, addr, next, walk);

and look at the two different sides of the if-statement: if they get
to "walk_pte_range()", both cases will have verified that there
actually _is_ a pte level. They will just have done it differently -
the "we didn't have a pmd function" case will have split the pmd if it
was a hugepmd, while the "we do have a pmd_entry" case will just have
checked whether it's still a huge-pmd, done a "continue" if it was,
and never even tried to walk the ptes.

But I think the "change pmd_entry to have a sane return code" is a
simpler and more flexible model, and then the pmd_entry code can just
let the pte walker split the pmd if needed.

So I liked that part of your patch.

               Linus
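
As an illustration of the control flow quoted above, here is a minimal userspace sketch of the two sides of the if-statement. It is not kernel code: every `toy_*` name is hypothetical, a pmd is reduced to a single "huge" flag, and the swap/devmap checks and the `goto again` retry on an unstable pmd are left out. The only point it tries to show is that the pte walker is never reached while the pmd is still huge.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a pmd: huge mapping, or a pte table below it. */
struct toy_pmd {
	bool huge;        /* true: huge mapping, so there is no pte level */
	int splits;       /* number of times this pmd was split */
	int ptes_walked;  /* number of times the pte walker ran on it */
};

/* Hypothetical stand-in for the walk ops: optional pmd callback, pte flag. */
struct toy_walk_ops {
	int (*pmd_entry)(struct toy_pmd *pmd);
	bool has_pte_entry;
};

static void toy_split_huge_pmd(struct toy_pmd *pmd)
{
	if (pmd->huge) {
		pmd->huge = false;
		pmd->splits++;
	}
}

static int toy_walk_pte_range(struct toy_pmd *pmd)
{
	/* Both callers must have made sure a pte level exists. */
	assert(!pmd->huge);
	pmd->ptes_walked++;
	return 0;
}

/* Models one iteration of the pmd loop; "return 0" plays the role of "continue". */
static int toy_walk_pmd(const struct toy_walk_ops *ops, struct toy_pmd *pmd)
{
	if (ops->pmd_entry) {
		int err = ops->pmd_entry(pmd);

		if (err)
			return err;
		/* No pte level walking? */
		if (!ops->has_pte_entry)
			return 0;
		/* No pte level at all? Skip, and in particular do not split. */
		if (pmd->huge)
			return 0;
	} else {
		if (!ops->has_pte_entry)
			return 0;
		/* No pmd callback: always split before going to the pte level. */
		toy_split_huge_pmd(pmd);
	}
	return toy_walk_pte_range(pmd);
}

/* Hypothetical callback that handles the huge pmd itself and leaves it huge. */
static int toy_pmd_entry_keep(struct toy_pmd *pmd)
{
	(void)pmd;
	return 0;
}

/* Hypothetical callback that wants pte-level walking, so it splits the pmd. */
static int toy_pmd_entry_split(struct toy_pmd *pmd)
{
	toy_split_huge_pmd(pmd);
	return 0;
}
```

Calling `toy_walk_pmd()` on a huge pmd with `toy_pmd_entry_keep` leaves it unsplit and unwalked, with `toy_pmd_entry_split` it is split by the callback and then walked, and with no `pmd_entry` at all it is split by the walker before the pte level is visited.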