Message-ID: <524bacd2-4a47-2b8b-6685-c46e31a01631@redhat.com>
Date: Fri, 7 Jul 2023 13:29:02 +0200
From: David Hildenbrand
Organization: Red Hat
To: Ryan Roberts, "Huang, Ying"
Cc: Andrew Morton, Matthew Wilcox, "Kirill A. Shutemov", Yin Fengwei,
 Yu Zhao, Catalin Marinas, Will Deacon, Anshuman Khandual, Yang Shi,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org
Subject: Re: [PATCH v2 4/5] mm: FLEXIBLE_THP for improved performance
In-Reply-To: <44e60630-5e9d-c8df-ab79-cb0767de680e@arm.com>
References: <20230703135330.1865927-1-ryan.roberts@arm.com>
 <20230703135330.1865927-5-ryan.roberts@arm.com>
 <87edlkgnfa.fsf@yhuang6-desk2.ccr.corp.intel.com>
 <44e60630-5e9d-c8df-ab79-cb0767de680e@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 07.07.23 11:52, Ryan Roberts wrote:
> On 07/07/2023 09:01, Huang, Ying wrote:
>> Ryan Roberts writes:
>>
>>> Introduce FLEXIBLE_THP feature, which allows anonymous memory to be
>>> allocated in large folios of a specified order.
>>> All pages of the large folio are pte-mapped during the same page
>>> fault, significantly reducing the number of page faults. The number
>>> of per-page operations (e.g. ref counting, rmap management, lru list
>>> management) is also significantly reduced since those ops now
>>> become per-folio.
>>
>> I like the idea to share as much code as possible between large
>> (anonymous) folio and THP. Finally, THP becomes just a special kind of
>> large folio.
>>
>> Although we can use smaller page order for FLEXIBLE_THP, it's hard to
>> avoid internal fragmentation completely. So, I think that finally we
>> will need to provide a mechanism for the users to opt out, e.g.,
>> something like "always madvise never" via
>> /sys/kernel/mm/transparent_hugepage/enabled. I'm not sure whether it's
>> a good idea to reuse the existing interface of THP.
>
> I wouldn't want to tie this to the existing interface, simply because that
> implies that we would want to follow the "always" and "madvise" advice too; that
> means that on a thp=madvise system (which is certainly the case for android and
> other client systems) we would have to disable large anon folios for VMAs that
> haven't explicitly opted in. That breaks the intention that this should be an
> invisible performance boost. I think it's important to set the policy for use of

It will never ever be a completely invisible performance boost, just
like ordinary THP.

Using the exact same existing toggle is the right thing to do. If
someone specifies "never" or "madvise", then do exactly that.

It might make sense to have more modes or additional toggles, but
"madvise=never" means no memory waste.

I remember I raised it already in the past, but you *absolutely* have to
respect the MADV_NOHUGEPAGE flag. There is user space out there (for
example, userfaultfd) that doesn't want the kernel to populate any
additional page tables. So if you have to respect that already, then
also respect MADV_HUGEPAGE, simple.

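To make the semantics argued for above concrete, here is a rough
userspace model of how the existing toggle and the per-VMA madvise hints
would combine. This is a sketch only, not kernel code: the enums and the
function name large_anon_folio_allowed() are made up for illustration
and do not exist in the patch series or in the kernel.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the three modes exposed by
 * /sys/kernel/mm/transparent_hugepage/enabled.
 */
enum thp_mode { THP_NEVER, THP_MADVISE, THP_ALWAYS };

/*
 * Hypothetical per-VMA hint, standing in for what user space requested
 * via madvise(MADV_HUGEPAGE) or madvise(MADV_NOHUGEPAGE).
 */
enum vma_hint { HINT_NONE, HINT_HUGEPAGE, HINT_NOHUGEPAGE };

/*
 * Returns true if a large anon folio may be used for a fault in a VMA
 * with the given hint under the given system-wide mode.
 */
static bool large_anon_folio_allowed(enum thp_mode mode, enum vma_hint hint)
{
	/* MADV_NOHUGEPAGE always wins, whatever the global mode is. */
	if (hint == HINT_NOHUGEPAGE)
		return false;

	switch (mode) {
	case THP_NEVER:
		return false;			/* no memory waste, ever */
	case THP_MADVISE:
		return hint == HINT_HUGEPAGE;	/* explicit opt-in only */
	case THP_ALWAYS:
		return true;			/* opt-out handled above */
	}
	return false;
}
```

With this model, a VMA marked MADV_NOHUGEPAGE never gets a large folio,
and on a "madvise" system only VMAs that opted in via MADV_HUGEPAGE do.
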
> THP separately to use of large anon folios.
>
> I could be persuaded on the merits of a new runtime enable/disable interface if
> there is consensus.

There would have to be a very good reason for a completely separate
control.

Bypassing MADV_NOHUGEPAGE or "madvise=never" simply because we add a
"flexible" before the THP sounds broken.

-- 
Cheers,

David / dhildenb