From: Lance Yang
Date: Thu, 1 Feb 2024 21:49:47 +0800
Subject: Re: [PATCH 1/1] mm/khugepaged: skip copying lazyfree pages on collapse
To: Andrew Morton
Cc: Michal Hocko, David Hildenbrand, Minchan Kim, "Zach O'Keefe", Peter Xu, Muchun Song, Yang Shi, linux-kernel@vger.kernel.org, linux-mm@kvack.org
In-Reply-To: <20240201125226.28372-1-ioworker0@gmail.com>
References: <20240201125226.28372-1-ioworker0@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

THP:
enabled=madvise
defrag=defer
max_ptes_none=511
scan_sleep_millisecs=1000
alloc_sleep_millisecs=1000

Test code:

// allocate a 2MB chunk using mmap and later
// release it by MADV_FREE
Link: https://github.com/ioworker0/mmapvsmprotect/blob/main/test5.c

root@x:/tmp# ./a.out | grep -B 23 hg
7f762a200000-7f762a400000 rw-p 00000000 00:00 0
Size: 2048 kB
Anonymous: 2048 kB
LazyFree: 2048 kB
AnonHugePages: 0 kB
THPeligible: 1
VmFlags: rd wr mr mw me ac sd hg

// allocate a 2MB chunk using mmap and later
// some pages marked as lazyfree with MADV_FREE
Link: https://github.com/ioworker0/mmapvsmprotect/blob/main/test4.c

root@x:/tmp# ./a.out | grep -B 23 hg
7f762a200000-7f762a400000 rw-p 00000000 00:00 0
Size: 2048 kB
Anonymous: 2048 kB
LazyFree: 0 kB
AnonHugePages: 2048 kB
THPeligible: 1
VmFlags: rd wr mr mw me ac sd hg

root@x:/tmp# ./a.out
[...]
root@x:/tmp# echo $?
2

On Thu, Feb 1, 2024 at 8:53 PM Lance Yang wrote:
>
> The collapsing behavior of khugepaged with pages
> marked using MADV_FREE might cause confusion
> among users.
>
> For instance, allocate a 2MB chunk using mmap and
> later release it by MADV_FREE. Khugepaged will not
> collapse this chunk. From the user's perspective,
> it treats lazyfree pages as pte_none. However,
> for some pages marked as lazyfree with MADV_FREE,
> khugepaged might collapse this chunk and copy
> these pages to a new huge page. This inconsistency
> in behavior could be confusing for users.
>
> After a successful MADV_FREE operation, if there is
> no subsequent write, the kernel can free the pages
> at any time. Therefore, in my opinion, counting
> lazyfree pages in max_ptes_none seems reasonable.
>
> Perhaps treating MADV_FREE like MADV_DONTNEED, not
> copying lazyfree pages when khugepaged collapses
> huge pages in the background better aligns with
> user expectations.
>
> Signed-off-by: Lance Yang
> ---
>  mm/khugepaged.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 2b219acb528e..6cbf46d42c6a 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -777,6 +777,7 @@ static int __collapse_huge_page_copy(pte_t *pte,
>  				     pmd_t orig_pmd,
>  				     struct vm_area_struct *vma,
>  				     unsigned long address,
> +				     struct collapse_control *cc,
>  				     spinlock_t *ptl,
>  				     struct list_head *compound_pagelist)
>  {
> @@ -797,6 +798,13 @@ static int __collapse_huge_page_copy(pte_t *pte,
>  			continue;
>  		}
>  		src_page = pte_page(pteval);
> +
> +		if (cc->is_khugepaged
> +		    && !folio_test_swapbacked(page_folio(src_page))) {
> +			clear_user_highpage(page, _address);
> +			continue;
> +		}
> +
>  		if (copy_mc_user_highpage(page, src_page, _address, vma) > 0) {
>  			result = SCAN_COPY_MC;
>  			break;
> @@ -1205,7 +1213,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
>  	anon_vma_unlock_write(vma->anon_vma);
>
>  	result = __collapse_huge_page_copy(pte, hpage, pmd, _pmd,
> -					   vma, address, pte_ptl,
> +					   vma, address, cc, pte_ptl,
>  					   &compound_pagelist);
>  	pte_unmap(pte);
>  	if (unlikely(result != SCAN_SUCCEED))
> --
> 2.33.1
>
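
For readers who do not want to pull the linked repository, here is a minimal sketch of the two scenarios described above. It is not the code from test4.c or test5.c. The helper names (align_up_2m, make_lazyfree_chunk, smaps_field_kb), the over-allocate-and-align approach, and the ordering of the madvise calls are all assumptions made for illustration. It assumes Linux with THP enabled=madvise and a 2MB PMD size.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for MADV_FREE and MADV_HUGEPAGE */
#endif
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

#define CHUNK_SIZE (2UL << 20) /* 2MB; assumed PMD size (x86-64) */

/* Round addr up to the next 2MB boundary. */
uintptr_t align_up_2m(uintptr_t addr)
{
	return (addr + CHUNK_SIZE - 1) & ~(uintptr_t)(CHUNK_SIZE - 1);
}

/*
 * Hypothetical helper: map a 2MB-aligned anonymous chunk, fault in every
 * page, opt the range in to THP, then mark the first lazyfree_len bytes
 * as lazyfree. Returns the chunk, or NULL on failure.
 */
void *make_lazyfree_chunk(size_t lazyfree_len)
{
	/* Over-allocate so that a 2MB-aligned chunk always fits inside. */
	char *raw = mmap(NULL, 2 * CHUNK_SIZE, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	char *chunk;

	if (raw == MAP_FAILED)
		return NULL;
	chunk = (char *)align_up_2m((uintptr_t)raw);

	/* Touch the range before MADV_HUGEPAGE so that, with
	 * enabled=madvise, the collapse is left to khugepaged. */
	memset(chunk, 1, CHUNK_SIZE);

	if (madvise(chunk, CHUNK_SIZE, MADV_HUGEPAGE) != 0 ||
	    (lazyfree_len > 0 &&
	     madvise(chunk, lazyfree_len, MADV_FREE) != 0)) {
		munmap(raw, 2 * CHUNK_SIZE);
		return NULL;
	}
	return chunk;
}

/*
 * Extract the kB value of a "Field: N kB" line from smaps-style text.
 * Only matches a whole field name at the start of a line.
 * Returns -1 if the field is absent.
 */
long smaps_field_kb(const char *text, const char *field)
{
	size_t flen = strlen(field);
	const char *p = text;

	while ((p = strstr(p, field)) != NULL) {
		int at_line_start = (p == text) || (p[-1] == '\n');

		if (at_line_start && p[flen] == ':')
			return strtol(p + flen + 1, NULL, 10);
		p += flen;
	}
	return -1;
}
```

Under these assumptions, make_lazyfree_chunk(CHUNK_SIZE) corresponds to the first scenario (the whole chunk released by MADV_FREE) and make_lazyfree_chunk(CHUNK_SIZE / 2) to the second (only some pages lazyfree; the half is an arbitrary choice). After sleeping for a few scan_sleep_millisecs intervals, the LazyFree and AnonHugePages values for the mapping can be read from /proc/self/smaps and parsed with smaps_field_kb.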