Date: Mon, 24 Jan 2022 10:48:55 -0800 (PST)
From: David Rientjes
To: Peter Xu, Zach O'Keefe, SeongJae Park
cc: Shakeel Butt, David Hildenbrand, "Kirill A . Shutemov", Yang Shi, Zi Yan, Matthew Wilcox, Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: split thp synchronously on MADV_DONTNEED
References: <20211120201230.920082-1-shakeelb@google.com> <25b36a5c-5bbd-5423-0c67-05cd6c1432a7@redhat.com>

On Fri, 26 Nov 2021, Peter Xu wrote:

> Some side notes: I digged out the old MADV_COLLAPSE proposal right after I
> thought about MADV_SPLIT (or any of its variance):
>
>   https://lore.kernel.org/all/d098c392-273a-36a4-1a29-59731cdf5d3d@google.com/
>
> My memory was that there's some issue to be solved so that was blocked, however
> when I read the thread it sounds like the list was mostly reaching a consensus
> on considering MADV_COLLAPSE being beneficial.
> Still copying DavidR in case I missed something important.
>
> If we think MADV_COLLAPSE can help to implement an userspace (and more
> importantly, data-aware) khugepaged, then MADV_SPLIT can be the other side of
> kcompactd, perhaps.
>
> That's probably a bit off topic of this specific discussion on the specific use
> case, but so far it seems all reasonable and discussable.
>

Hi Peter,

Providing a (late) update since we now have some better traction on this:
I think we'll be ready to post an RFC soon that introduces MADV_COLLAPSE.
The work is being driven by Zach, now cc'd.

Let's also include SeongJae Park and keep him in the loop, since DAMON
could easily be extended with a DAMOS_COLLAPSE action to use MADV_COLLAPSE
for hot regions of memory.

Idea for initial approach:

 - MADV_COLLAPSE core code based on the proposal you cite above for anon
   memory as the inaugural support, collapse memory into thp in process
   context

 - Batching support to collapse ranges of memory into multiple THP

 - Wire this up for madvise(2) (and process_madvise(2))

 - Enlightenment for file-backed thp

I think Zach's RFC will cover the first three; it could be debated whether
the initial patch series *must* support file-backed thp. We'll see based on
the feedback to the RFC.

There's also an extension where MADV_COLLAPSE could potentially be useful
for hugetlb-backed memory. We have another effort underway that we've been
talking with Mike Kravetz about that allows hugetlb memory to be mapped at
multiple levels of the page tables. There are several use cases, but one of
the driving factors is the performance of post-copy live migration; in this
case, you'd be able to send smaller pages over the wire rather than, say, a
1GB gigantic page.

In this case, MADV_COLLAPSE could be useful to map smaller pages by a
larger page table entry before all of the smaller pages have been live
migrated.

That said, we have not invested time into an MADV_SPLIT yet.
Do you (or anybody else) have concerns about this approach? Ideas for extensions? Thanks!