From: Lance Yang
Date: Fri, 2 Feb 2024 19:23:13 +0800
Subject: Re: [PATCH 1/1] mm/khugepaged: skip copying lazyfree pages on collapse
To: Yang Shi
Cc: akpm@linux-foundation.org, mhocko@suse.com, zokeefe@google.com, david@redhat.com, songmuchun@bytedance.com, peterx@redhat.com, minchan@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org
References: <20240201125226.28372-1-ioworker0@gmail.com>

On Fri, Feb 2, 2024 at 4:37 AM Yang Shi wrote:
>
> On Thu, Feb 1, 2024 at 4:53 AM Lance Yang wrote:
> >
> > The collapsing behavior of khugepaged with pages
> > marked using MADV_FREE might cause confusion
> > among users.
> >
> > For instance, allocate a 2MB chunk using mmap and
> > later release it by MADV_FREE. Khugepaged will not
> > collapse this chunk. From the user's perspective,
> > it treats lazyfree pages as pte_none. However,
> > for some pages marked as lazyfree with MADV_FREE,
> > khugepaged might collapse this chunk and copy
> > these pages to a new huge page. This inconsistency
> > in behavior could be confusing for users.
> >
> > After a successful MADV_FREE operation, if there is
> > no subsequent write, the kernel can free the pages
> > at any time. Therefore, in my opinion, counting
> > lazyfree pages in max_pte_none seems reasonable.
> >
> > Perhaps treating MADV_FREE like MADV_DONTNEED, not
> > copying lazyfree pages when khugepaged collapses
> > huge pages in the background better aligns with
> > user expectations.
> >
> > Signed-off-by: Lance Yang
> > ---
> >  mm/khugepaged.c | 10 +++++++++-
> >  1 file changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > index 2b219acb528e..6cbf46d42c6a 100644
> > --- a/mm/khugepaged.c
> > +++ b/mm/khugepaged.c
> > @@ -777,6 +777,7 @@ static int __collapse_huge_page_copy(pte_t *pte,
> >                                      pmd_t orig_pmd,
> >                                      struct vm_area_struct *vma,
> >                                      unsigned long address,
> > +                                    struct collapse_control *cc,
> >                                      spinlock_t *ptl,
> >                                      struct list_head *compound_pagelist)
> >  {
> > @@ -797,6 +798,13 @@ static int __collapse_huge_page_copy(pte_t *pte,
> >                         continue;
> >                 }
> >                 src_page = pte_page(pteval);
> > +
> > +               if (cc->is_khugepaged
> > +                       && !folio_test_swapbacked(page_folio(src_page))) {
> > +                       clear_user_highpage(page, _address);
> > +                       continue;
>
> If the page was written before khugepaged collapsed it, and khugepaged
> collapsed the page before memory reclaim kicked in, didn't this
> somehow cause data corruption?
>

Thanks a lot!

Yang, you're correct; indeed, there is a potential issue with data corruption.
I took a look at the check for lazyfree pages in smaps_pte_entry.

Here's the modification:

        if (cc->is_khugepaged && !PageSwapBacked(src_page) &&
            !pte_dirty(pteval) && !PageDirty(src_page)) {
                clear_user_highpage(page, _address);
                continue;
        }

Could you please take a look?

Thanks,
Lance

> > +               }
> > +
> >                 if (copy_mc_user_highpage(page, src_page, _address, vma) > 0) {
> >                         result = SCAN_COPY_MC;
> >                         break;
> > @@ -1205,7 +1213,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
> >         anon_vma_unlock_write(vma->anon_vma);
> >
> >         result = __collapse_huge_page_copy(pte, hpage, pmd, _pmd,
> > -                                          vma, address, pte_ptl,
> > +                                          vma, address, cc, pte_ptl,
> >                                            &compound_pagelist);
> >         pte_unmap(pte);
> >         if (unlikely(result != SCAN_SUCCEED))
> > --
> > 2.33.1
> >