Date: Tue, 12 Dec 2023 15:32:29 +0000
Subject: Re: [PATCH v9 03/10] mm: thp: Introduce multi-size THP sysfs interface
To: David Hildenbrand, Andrew
 Morton, Matthew Wilcox, Yin Fengwei, Yu Zhao, Catalin Marinas,
 Anshuman Khandual, Yang Shi, "Huang, Ying", Zi Yan, Luis Chamberlain,
 Itaru Kitayama, "Kirill A. Shutemov", John Hubbard, David Rientjes,
 Vlastimil Babka, Hugh Dickins, Kefeng Wang, Barry Song <21cnbao@gmail.com>,
 Alistair Popple
Cc: linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org, Barry Song
References: <20231207161211.2374093-1-ryan.roberts@arm.com>
 <20231207161211.2374093-4-ryan.roberts@arm.com>
From: Ryan Roberts

On 12/12/2023 14:54, David Hildenbrand wrote:
> On 07.12.23 17:12, Ryan Roberts wrote:
>> In preparation for adding support for anonymous multi-size THP,
>> introduce new sysfs structure that will be used to control the new
>> behaviours. A new directory is added under transparent_hugepage for each
>> supported THP size, and contains an `enabled` file, which can be set to
>> "inherit" (to inherit the global setting), "always", "madvise" or
>> "never". For now, the kernel still only supports PMD-sized anonymous
>> THP, so only 1 directory is populated.
>>
>> The first half of the change converts transhuge_vma_suitable() and
>> hugepage_vma_check() so that they take a bitfield of orders for which
>> the user wants to determine support, and the functions filter out all
>> the orders that can't be supported, given the current sysfs
>> configuration and the VMA dimensions.
>> The resulting functions are renamed to thp_vma_suitable_orders() and
>> thp_vma_allowable_orders() respectively. Convenience functions that
>> take a single, unencoded order and return a boolean are also defined as
>> thp_vma_suitable_order() and thp_vma_allowable_order().
>>
>> The second half of the change implements the new sysfs interface. It has
>> been done so that each supported THP size has a `struct thpsize`, which
>> describes the relevant metadata and is itself a kobject. This is pretty
>> minimal for now, but should make it easy to add new per-thpsize files to
>> the interface if needed in future (e.g. per-size defrag). Rather than
>> keep the `enabled` state directly in the struct thpsize, I've elected to
>> directly encode it into huge_anon_orders_[always|madvise|inherit]
>> bitfields since this reduces the amount of work required in
>> thp_vma_allowable_orders() which is called for every page fault.
>>
>> See Documentation/admin-guide/mm/transhuge.rst, as modified by this
>> commit, for details of how the new sysfs interface works.
>>
>> Reviewed-by: Barry Song
>> Tested-by: Kefeng Wang
>> Tested-by: John Hubbard
>> Signed-off-by: Ryan Roberts
>> ---
>
> [...]
>
>> +
>> +static ssize_t thpsize_enabled_store(struct kobject *kobj,
>> +                     struct kobj_attribute *attr,
>> +                     const char *buf, size_t count)
>> +{
>> +    int order = to_thpsize(kobj)->order;
>> +    ssize_t ret = count;
>> +
>> +    if (sysfs_streq(buf, "always")) {
>> +        spin_lock(&huge_anon_orders_lock);
>> +        clear_bit(order, &huge_anon_orders_inherit);
>> +        clear_bit(order, &huge_anon_orders_madvise);
>> +        set_bit(order, &huge_anon_orders_always);
>> +        spin_unlock(&huge_anon_orders_lock);
>> +    } else if (sysfs_streq(buf, "inherit")) {
>> +        spin_lock(&huge_anon_orders_lock);
>> +        clear_bit(order, &huge_anon_orders_always);
>> +        clear_bit(order, &huge_anon_orders_madvise);
>> +        set_bit(order, &huge_anon_orders_inherit);
>> +        spin_unlock(&huge_anon_orders_lock);
>> +    } else if (sysfs_streq(buf, "madvise")) {
>> +        spin_lock(&huge_anon_orders_lock);
>> +        clear_bit(order, &huge_anon_orders_always);
>> +        clear_bit(order, &huge_anon_orders_inherit);
>> +        set_bit(order, &huge_anon_orders_madvise);
>> +        spin_unlock(&huge_anon_orders_lock);
>> +    } else if (sysfs_streq(buf, "never")) {
>> +        spin_lock(&huge_anon_orders_lock);
>> +        clear_bit(order, &huge_anon_orders_always);
>> +        clear_bit(order, &huge_anon_orders_inherit);
>> +        clear_bit(order, &huge_anon_orders_madvise);
>> +        spin_unlock(&huge_anon_orders_lock);
>
> Why not perform lock/unlock only once in surrounding code? :)

I was nervous that sysfs_streq() may be unhappy in atomic context... Unfounded?

>
>
> Much better
>
> Acked-by: David Hildenbrand
>
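For illustration only, here is a minimal userspace sketch of the "lock/unlock only once" shape asked about above. It is not the code from the patch. As far as I know sysfs_streq() is a plain byte-wise comparison that does not sleep, but the sketch sidesteps the question anyway by decoding the string before the lock is taken. Everything here is a hypothetical stand-in: orders_always/orders_inherit/orders_madvise mimic the huge_anon_orders_* bitfields, the atomic_flag spinlock mimics huge_anon_orders_lock, streq_nl() is a simplified substitute for sysfs_streq(), and the -EINVAL return for unknown input is an assumption because the quoted hunk is cut off before that branch.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <string.h>
#include <sys/types.h>

/* Userspace stand-ins for huge_anon_orders_{always,inherit,madvise}. */
static unsigned long orders_always, orders_inherit, orders_madvise;

/* Toy spinlock standing in for huge_anon_orders_lock. */
static atomic_flag orders_lock = ATOMIC_FLAG_INIT;

static void orders_lock_acquire(void)
{
    while (atomic_flag_test_and_set_explicit(&orders_lock,
                                             memory_order_acquire))
        ; /* spin until the flag was previously clear */
}

static void orders_lock_release(void)
{
    atomic_flag_clear_explicit(&orders_lock, memory_order_release);
}

/*
 * Simplified substitute for sysfs_streq(): compare buf against a keyword,
 * ignoring a single trailing newline in buf only.
 */
static int streq_nl(const char *buf, const char *keyword)
{
    size_t len = strlen(buf);

    if (len && buf[len - 1] == '\n')
        len--;
    return len == strlen(keyword) && memcmp(buf, keyword, len) == 0;
}

/* Hypothetical store handler: decode first, then one lock/unlock pair. */
static ssize_t enabled_store(unsigned int order, const char *buf, size_t count)
{
    unsigned long *target = NULL; /* bitfield to set; NULL means "never" */
    unsigned long bit;

    assert(order < 8 * sizeof(unsigned long));
    bit = 1UL << order;

    /* All string comparisons happen outside the critical section. */
    if (streq_nl(buf, "always"))
        target = &orders_always;
    else if (streq_nl(buf, "inherit"))
        target = &orders_inherit;
    else if (streq_nl(buf, "madvise"))
        target = &orders_madvise;
    else if (!streq_nl(buf, "never"))
        return -EINVAL; /* assumed error path, not shown in the quote */

    /* Single critical section: clear the order everywhere, set at most one. */
    orders_lock_acquire();
    orders_always &= ~bit;
    orders_inherit &= ~bit;
    orders_madvise &= ~bit;
    if (target)
        *target |= bit;
    orders_lock_release();

    return (ssize_t)count;
}
```

With this shape an invalid string never touches the lock, and the three bitfields always change together inside one critical section.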