Subject: Re: [RFC PATCH 0/3] support large folio for mlock
From: Ryan Roberts
Date: Mon, 10 Jul 2023 11:36:44 +0100
To: Matthew Wilcox, David Hildenbrand
Cc: Yin Fengwei, linux-mm@kvack.org, linux-kernel@vger.kernel.org, yuzhao@google.com, shy828301@gmail.com, akpm@linux-foundation.org
References: <20230707165221.4076590-1-fengwei.yin@intel.com> <4bb39d6e-a324-0d85-7d44-8e8a37a1cfec@redhat.com> <5c9bf622-0866-168f-a1cd-4e4a98322127@redhat.com>

On 07/07/2023 20:26, Matthew Wilcox wrote:
> On Fri, Jul 07, 2023 at 09:15:02PM +0200, David Hildenbrand wrote:
>>>> Sure, any time we PTE-map a THP we might just say "let's put that on the
>>>> deferred split queue" and cross fingers that we can eventually split it
>>>> later. (I was recently thinking about that in the context of the mapcount
>>>> ...)
>>>>
>>>> It's all a big mess ...
>>>
>>> Oh, I agree, there are always going to be circumstances where we realise
>>> we've made a bad decision and can't (easily) undo it. Unless we have a
>>> per-page pincount, and I Would Rather Not Do That.
>>
>> I agree ...
>>
>> But we should _try_
>>> to do that because it's the right model -- that's what I meant by "Tell
>>
>> Try to have per-page pincounts? :/ or do you mean, try to split on VMA
>> split? I hope the latter (although I'm not sure about performance) :)
>
> Sorry, try to split a folio on VMA split.
>
>>> me why I'm wrong"; what scenarios do we have where a user temporarily
>>> mlocks (or mprotects or ...)
>>> a range of memory, but wants that memory
>>> to be aged in the LRU exactly the same way as the adjacent memory that
>>> wasn't mprotected?
>>
>> Let me throw in a "fun one".
>>
>> Parent process has a 2 MiB range populated by a THP. fork() a child process.
>> Child process mprotects half the VMA.
>>
>> Should we split the (COW-shared) THP? Or should we COW/unshare in the child
>> process (ugh!) during the VMA split?
>>
>> It all makes my brain hurt.
>
> OK, so this goes back to what I wrote earlier about attempting to choose
> what size of folio to allocate on COW:
>
> https://lore.kernel.org/linux-mm/Y%2FU8bQd15aUO97vS@casper.infradead.org/
>
> : the parent had already established
> : an appropriate size folio to use for this VMA before calling fork().
> : Whether it is the parent or the child causing the COW, it should probably
> : inherit that choice and we should default to the same size folio that
> : was already found.

FWIW, I had patches in my original RFC that aimed to follow this policy for
large anon folios [1] & [2], and intend to follow up with a modified version
of these patches once we have an initial submission.

[1] https://lore.kernel.org/linux-mm/20230414130303.2345383-11-ryan.roberts@arm.com/
[2] https://lore.kernel.org/linux-mm/20230414130303.2345383-15-ryan.roberts@arm.com/
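
To make the folio-size-on-COW policy quoted above concrete, a minimal sketch
in kernel-style C follows. It illustrates the idea only and is not the code
in [1] or [2]: anon_cow_alloc_folio() is a hypothetical helper, the real
fault-path locking and accounting are elided, and it assumes the v6.4-era
vma_alloc_folio() signature (with the trailing bool hugepage argument).

/*
 * Illustrative sketch only: allocate the COW destination folio at the
 * same order the parent established, falling back towards order-0
 * under memory pressure. Real code would sit in the wp_page_copy()
 * path in mm/memory.c.
 */
static struct folio *anon_cow_alloc_folio(struct vm_fault *vmf,
					  struct folio *src)
{
	struct vm_area_struct *vma = vmf->vma;
	int order = folio_order(src);	/* the size the parent chose */
	struct folio *dst;

	while (order > 0) {
		unsigned long addr = vmf->address &
				     ~((PAGE_SIZE << order) - 1);

		/* Skip orders whose naturally aligned range leaves the VMA. */
		if (addr >= vma->vm_start &&
		    addr + (PAGE_SIZE << order) <= vma->vm_end) {
			dst = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, order,
					      vma, addr, true);
			if (dst)
				return dst;
		}
		order--;
	}

	/* Final fallback: a single page, as wp_page_copy() does today. */
	return vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0, vma, vmf->address,
			       false);
}

The fallback ladder is the interesting design point: inheriting the source
folio's order is only a default, and under fragmentation the allocation
degrades gracefully to smaller orders rather than failing the fault.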
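
Similarly, the "split a folio on VMA split" suggestion from earlier in the
thread might look roughly like the sketch below. Again this is hypothetical,
not from any posted patch: vm_normal_folio_at() stands in for a page-table
walk that finds the folio mapped at an address, and the natural call site
would be __split_vma() with 'addr' being the new VMA boundary.

/*
 * Illustrative sketch only: if a large folio is mapped across the new
 * VMA boundary, try to split it so the two halves can be mlocked and
 * LRU-aged independently. A real implementation must also verify that
 * the folio's mapping actually spans 'addr' rather than merely
 * starting there.
 */
static void split_folio_at_vma_boundary(struct vm_area_struct *vma,
					unsigned long addr)
{
	struct folio *folio = vm_normal_folio_at(vma, addr);

	if (!folio || !folio_test_large(folio))
		return;

	folio_get(folio);
	if (folio_trylock(folio)) {
		/*
		 * split_folio() may fail (e.g. the folio is pinned);
		 * that is tolerable -- the folio simply stays large,
		 * which is today's behaviour anyway.
		 */
		split_folio(folio);
		folio_unlock(folio);
	}
	folio_put(folio);
}

Note that failure is deliberately non-fatal here: as discussed above, there
will always be cases we can't undo, so the split is best-effort.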