From: Lance Yang
Date: Thu, 1 Feb 2024 09:13:02 +0800
Subject: Re: [PATCH 1/1] mm/khugepaged: bypassing unnecessary scans with MMF_DISABLE_THP check
To: Yang Shi
Cc: akpm@linux-foundation.org, mhocko@suse.com, zokeefe@google.com, david@redhat.com, songmuchun@bytedance.com, peterx@redhat.com, minchan@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20240129054551.57728-1-ioworker0@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hey Yang,

Thank you for the
clarification.

You're correct. If the daemon calls prctl with
MMF_DISABLE_THP before fork, the child mm won't
be on the hash list.

What I meant is that the daemon mm might already
be on the hash list before fork. Therefore,
khugepaged might still scan the address space for
the daemon.

Thanks,
Lance

On Thu, Feb 1, 2024 at 4:06 AM Yang Shi wrote:
>
> On Wed, Jan 31, 2024 at 1:30 AM Lance Yang wrote:
> >
> > Updating the change log.
> >
> > khugepaged scans the entire address space in the
> > background for each given mm, looking for
> > opportunities to merge sequences of basic pages
> > into huge pages. However, when an mm is inserted
> > to the mm_slots list, and the MMF_DISABLE_THP
> > flag is set later, this scanning process becomes
> > unnecessary for that mm and can be skipped to
> > avoid redundant operations, especially in scenarios
> > with a large address space.
> >
> > This commit introduces a check before each scanning
> > process to test the MMF_DISABLE_THP flag for the
> > given mm; if the flag is set, the scanning process is
> > bypassed, thereby improving the efficiency of khugepaged.
> >
> > This optimization is not a correctness issue but rather an
> > enhancement to save expensive checks on each VMA
> > when userspace cannot prctl itself before spawning
> > into the new process.
>
> If this is an optimization, you'd better show some real numbers to help justify.
>
> >
> > On some servers within our company, we deploy a
> > daemon responsible for monitoring and updating local
> > applications. Some applications prefer not to use THP,
> > so the daemon calls prctl to disable THP before fork/exec.
> > Conversely, for other applications, the daemon calls prctl
> > to enable THP before fork/exec.
>
> If your daemon calls prctl with MMF_DISABLE_THP before fork, then you
> end up having the child mm on the hash list in the first place, I
> think it should be a bug in khugepaged_fork() IIUC. khugepaged_fork()
> should check this flag and bail out if it is set. Did I miss
> something?
>
> >
> > Ideally, the daemon should invoke prctl after the fork,
> > but its current implementation follows the described
> > approach. In the Go standard library, there is no direct
> > encapsulation of the fork system call; instead, fork and
> > execve are combined into one through syscall.ForkExec.
> >
> > Thanks,
> > Lance
> >
> > On Mon, Jan 29, 2024 at 1:46 PM Lance Yang wrote:
> > >
> > > khugepaged scans the entire address space in the
> > > background for each given mm, looking for
> > > opportunities to merge sequences of basic pages
> > > into huge pages. However, when an mm is inserted
> > > to the mm_slots list, and the MMF_DISABLE_THP flag
> > > is set later, this scanning process becomes
> > > unnecessary for that mm and can be skipped to avoid
> > > redundant operations, especially in scenarios with
> > > a large address space.
> > >
> > > This commit introduces a check before each scanning
> > > process to test the MMF_DISABLE_THP flag for the
> > > given mm; if the flag is set, the scanning process
> > > is bypassed, thereby improving the efficiency of
> > > khugepaged.
> > >
> > > Signed-off-by: Lance Yang
> > > ---
> > >  mm/khugepaged.c | 18 ++++++++++++------
> > >  1 file changed, 12 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > > index 2b219acb528e..d6a700834edc 100644
> > > --- a/mm/khugepaged.c
> > > +++ b/mm/khugepaged.c
> > > @@ -410,6 +410,12 @@ static inline int hpage_collapse_test_exit(struct mm_struct *mm)
> > >         return atomic_read(&mm->mm_users) == 0;
> > >  }
> > >
> > > +static inline int hpage_collapse_test_exit_or_disable(struct mm_struct *mm)
> > > +{
> > > +       return hpage_collapse_test_exit(mm) ||
> > > +              test_bit(MMF_DISABLE_THP, &mm->flags);
> > > +}
> > > +
> > >  void __khugepaged_enter(struct mm_struct *mm)
> > >  {
> > >         struct khugepaged_mm_slot *mm_slot;
> > > @@ -1422,7 +1428,7 @@ static void collect_mm_slot(struct khugepaged_mm_slot *mm_slot)
> > >
> > >         lockdep_assert_held(&khugepaged_mm_lock);
> > >
> > > -       if (hpage_collapse_test_exit(mm)) {
> > > +       if (hpage_collapse_test_exit_or_disable(mm)) {
> > >                 /* free mm_slot */
> > >                 hash_del(&slot->hash);
> > >                 list_del(&slot->mm_node);
> > > @@ -2360,7 +2366,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
> > >                 goto breakouterloop_mmap_lock;
> > >
> > >         progress++;
> > > -       if (unlikely(hpage_collapse_test_exit(mm)))
> > > +       if (unlikely(hpage_collapse_test_exit_or_disable(mm)))
> > >                 goto breakouterloop;
> > >
> > >         vma_iter_init(&vmi, mm, khugepaged_scan.address);
> > > @@ -2368,7 +2374,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
> > >                 unsigned long hstart, hend;
> > >
> > >                 cond_resched();
> > > -               if (unlikely(hpage_collapse_test_exit(mm))) {
> > > +               if (unlikely(hpage_collapse_test_exit_or_disable(mm))) {
> > >                         progress++;
> > >                         break;
> > >                 }
> > > @@ -2390,7 +2396,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
> > >                         bool mmap_locked = true;
> > >
> > >                         cond_resched();
> > > -                       if (unlikely(hpage_collapse_test_exit(mm)))
> > > +                       if (unlikely(hpage_collapse_test_exit_or_disable(mm)))
> > >                                 goto breakouterloop;
> > >
> > >                         VM_BUG_ON(khugepaged_scan.address < hstart ||
> > > @@ -2408,7 +2414,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
> > >                                 fput(file);
> > >                                 if (*result == SCAN_PTE_MAPPED_HUGEPAGE) {
> > >                                         mmap_read_lock(mm);
> > > -                                       if (hpage_collapse_test_exit(mm))
> > > +                                       if (hpage_collapse_test_exit_or_disable(mm))
> > >                                                 goto breakouterloop;
> > >                                         *result = collapse_pte_mapped_thp(mm,
> > >                                                 khugepaged_scan.address, false);
> > > @@ -2450,7 +2456,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
> > >          * Release the current mm_slot if this mm is about to die, or
> > >          * if we scanned all vmas of this mm.
> > >          */
> > > -       if (hpage_collapse_test_exit(mm) || !vma) {
> > > +       if (hpage_collapse_test_exit_or_disable(mm) || !vma) {
> > >                 /*
> > >                  * Make sure that if mm_users is reaching zero while
> > >                  * khugepaged runs here, khugepaged_exit will find
> > > --
> > > 2.33.1
> > >
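
For anyone following the thread without a kernel tree at hand, the effect of
the helper in the quoted patch can be modeled in plain userspace C. This is
only a rough sketch, not kernel code: struct mm_model, MMF_DISABLE_THP_BIT
(and its value), model_test_exit(), model_test_exit_or_disable() and
model_scan_mm() are all hypothetical stand-ins invented for illustration.
The loop just counts how many "VMAs" would be visited before the check makes
the scan bail out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct mm_struct. */
struct mm_model {
	int mm_users;        /* models atomic_read(&mm->mm_users) */
	unsigned long flags; /* models mm->flags */
};

/* Assumed bit position, for illustration only. */
#define MMF_DISABLE_THP_BIT 24UL

/* Models hpage_collapse_test_exit(): true once the mm has no users left. */
static bool model_test_exit(const struct mm_model *mm)
{
	assert(mm != NULL);
	return mm->mm_users == 0;
}

/* Models hpage_collapse_test_exit_or_disable() from the quoted patch. */
static bool model_test_exit_or_disable(const struct mm_model *mm)
{
	return model_test_exit(mm) ||
	       (mm->flags & (1UL << MMF_DISABLE_THP_BIT)) != 0;
}

/*
 * Hypothetical scan loop: returns how many of nr_vmas entries were visited
 * before the mm was found to be exiting or to have THP disabled.
 */
static size_t model_scan_mm(const struct mm_model *mm, size_t nr_vmas)
{
	size_t visited = 0;

	for (size_t i = 0; i < nr_vmas; i++) {
		if (model_test_exit_or_disable(mm))
			break;
		visited++;
	}
	return visited;
}
```

In this model, an mm with live users and a clear flag gets all of its entries
visited, while setting the flag (or dropping mm_users to zero) makes the scan
stop before the first entry, which is the saving the change log describes.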