Message-ID: <37dbd4d0-c125-6694-dec4-6322ae5b6dee@suse.cz>
Date: Wed, 13 Sep 2023 22:18:17 +0200
Subject: Re: [PATCH 6/6] mm: page_alloc: consolidate free page accounting
From: Vlastimil Babka
To: Johannes Weiner, Andrew Morton
Cc: Mel Gorman, Miaohe Lin, Kefeng Wang, Zi Yan, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20230911195023.247694-1-hannes@cmpxchg.org> <20230911195023.247694-7-hannes@cmpxchg.org>
In-Reply-To: <20230911195023.247694-7-hannes@cmpxchg.org>

On 9/11/23 21:41, Johannes Weiner wrote:
> Free page accounting currently happens a bit too high up the call
> stack, where it has to deal with guard pages, compaction capturing,
> block stealing and even page isolation. This is subtle and fragile,
> and makes it difficult to hack on the code.
>
> Now that type violations on the freelists have been fixed, push the
> accounting down to where pages enter and leave the freelist.
>
> v3:
> - fix CONFIG_UNACCEPTED_MEMORY build (lkp)
> v2:
> - fix CONFIG_DEBUG_PAGEALLOC build (Mel)
>
> Signed-off-by: Johannes Weiner
>
>  /* Used for pages not on another list */
> -static inline void add_to_free_list_tail(struct page *page, struct zone *zone,
> -					 unsigned int order, int migratetype)
> +static inline void add_to_free_list(struct page *page, struct zone *zone,
> +				    unsigned int order, int migratetype,
> +				    bool tail)
>  {
>  	struct free_area *area = &zone->free_area[order];
>
> -	list_add_tail(&page->buddy_list, &area->free_list[migratetype]);
> +	VM_WARN_ONCE(get_pageblock_migratetype(page) != migratetype,
> +		     "page type is %lu, passed migratetype is %d (nr=%d)\n",
> +		     get_pageblock_migratetype(page), migratetype, 1 << order);

Ok, IIUC you now assume that the pageblock migratetype matches freelist
placement at all times. This is a change from the previous treatment as a
heuristic that may sometimes be imprecise.
Let's assume the previous patches handled the deterministic reasons why
those would deviate (modulo my concern about pageblocks spanning multiple
zones in reply to 5/6). But unless I'm missing something, I don't think the
possible race scenarios were dealt with? Pageblock migratetype is set under
zone->lock but there are places that read it outside of zone->lock and then
trust it to perform the freelist placement. See for example
__free_pages_ok(), or free_unref_page() in the cases it calls
free_one_page(). These determine pageblock migratetype before taking the
zone->lock. Only for has_isolate_pageblock() cases we are more careful,
because previously isolation was the only case where precision was needed.
So I think this warning is going to trigger?

> +
> +	if (tail)
> +		list_add_tail(&page->buddy_list, &area->free_list[migratetype]);
> +	else
> +		list_add(&page->buddy_list, &area->free_list[migratetype]);
>  	area->nr_free++;
> +
> +	account_freepages(page, zone, 1 << order, migratetype);
>  }
>
>  /*
> @@ -757,23 +783,21 @@ static inline void __free_one_page(struct page *page,
>  	VM_BUG_ON_PAGE(page->flags & PAGE_FLAGS_CHECK_AT_PREP, page);
>
>  	VM_BUG_ON(migratetype == -1);
> -	if (likely(!is_migrate_isolate(migratetype)))
> -		__mod_zone_freepage_state(zone, 1 << order, migratetype);
> -
>  	VM_BUG_ON_PAGE(pfn & ((1 << order) - 1), page);
>  	VM_BUG_ON_PAGE(bad_range(zone, page), page);
>
>  	while (order < MAX_ORDER) {
> -		if (compaction_capture(capc, page, order, migratetype)) {
> -			__mod_zone_freepage_state(zone, -(1 << order),
> -								migratetype);
> +		int buddy_mt;
> +
> +		if (compaction_capture(capc, page, order, migratetype))
>  			return;
> -		}
>
>  		buddy = find_buddy_page_pfn(page, pfn, order, &buddy_pfn);
>  		if (!buddy)
>  			goto done_merging;
>
> +		buddy_mt = get_pfnblock_migratetype(buddy, buddy_pfn);

You should assume buddy_mt equals migratetype, no? It's the same assumption
as the VM_WARN_ONCE() I've discussed?
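
The race window described for the VM_WARN_ONCE() can be modelled roughly as
follows. This is only a userspace sketch under stated assumptions, not
kernel code: every name in it (model_pageblock, model_free_sampled_early,
model_free_sampled_locked) is hypothetical, and the "concurrent" update is
simulated sequentially instead of with real threads or a real lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum model_mt { MODEL_MT_MOVABLE, MODEL_MT_UNMOVABLE };

struct model_pageblock {
	enum model_mt mt;	/* only written "under the lock" in this model */
};

/*
 * Free path that samples the pageblock type before taking the lock.
 * racing_mt simulates another CPU rewriting the type in the window
 * between the sample and the lock acquisition.
 * Returns true if the sampled type still matches at placement time.
 */
static bool model_free_sampled_early(struct model_pageblock *pb,
				     enum model_mt racing_mt)
{
	enum model_mt sampled;

	assert(pb != NULL);
	sampled = pb->mt;	/* read outside the lock */
	pb->mt = racing_mt;	/* concurrent update gets the lock first */
	/* ... lock taken here, page placed on free_list[sampled] ... */
	return pb->mt == sampled;	/* false: the warning condition holds */
}

/* Same sequence, but the type is looked up once the lock is held. */
static bool model_free_sampled_locked(struct model_pageblock *pb,
				      enum model_mt racing_mt)
{
	enum model_mt sampled;

	assert(pb != NULL);
	pb->mt = racing_mt;	/* concurrent update gets the lock first */
	/* ... lock taken here ... */
	sampled = pb->mt;	/* read under the lock */
	return pb->mt == sampled;
}
```

In this model the early sample only disagrees with the pageblock when the
simulated update actually changes the type, which is why such a warning
would be expected to fire rarely rather than on every free.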

> +
>  		if (unlikely(order >= pageblock_order)) {

Only here can buddy_mt differ, and the code in this block already handles
that.

>  			/*
>  			 * We want to prevent merge between freepages on pageblock
> @@ -801,9 +825,9 @@ static inline void __free_one_page(struct page *page,
>  		 * merge with it and move up one order.
>  		 */
>  		if (page_is_guard(buddy))
> -			clear_page_guard(zone, buddy, order, migratetype);
> +			clear_page_guard(zone, buddy, order);
>  		else
> -			del_page_from_free_list(buddy, zone, order);
> +			del_page_from_free_list(buddy, zone, order, buddy_mt);

Ugh, so this will add an account_freepages() call to each iteration of the
__free_one_page() hot loop, which seems like a lot of unnecessary overhead.
As long as we are within pageblock_order the migratetype should be the
same, and thus the is_migrate_isolate() and is_migrate_cma() tests should
also return the same value, so we shouldn't need to call
__mod_zone_page_state() piecemeal like this.

>  		combined_pfn = buddy_pfn & pfn;
>  		page = page + (combined_pfn - pfn);
>  		pfn = combined_pfn;
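
The overhead concern about the __free_one_page() merge loop can be made
concrete with a small userspace model. It is a sketch only, and all names in
it (model_zone, model_account, model_free_piecemeal, model_free_batched) are
hypothetical. It assumes every merge stays within one pageblock, so that all
buddies share the freed page's migratetype and a single counter is enough.
Removing buddies of order o, o+1, ..., o+m-1 and then adding the merged page
of order o+m nets out to exactly 1 << o pages, so one update would do.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model constant, not the kernel's MAX_ORDER. */
#define MODEL_MAX_ORDER 10

struct model_zone {
	long nr_free_pages;		/* stands in for the free page counter */
	unsigned long stat_updates;	/* how often the counter was touched */
};

static void model_account(struct model_zone *z, long nr_pages)
{
	assert(z != NULL);
	z->nr_free_pages += nr_pages;
	z->stat_updates++;
}

/* Accounting at freelist entry/exit: one update per merged buddy. */
static unsigned int model_free_piecemeal(struct model_zone *z,
					 unsigned int order,
					 unsigned int merges)
{
	unsigned int i;

	for (i = 0; i < merges && order < MODEL_MAX_ORDER; i++) {
		model_account(z, -(1L << order));	/* buddy leaves the list */
		order++;
	}
	model_account(z, 1L << order);	/* merged page enters the list */
	return order;
}

/* Accounting once for the net change: the pages actually being freed. */
static unsigned int model_free_batched(struct model_zone *z,
				       unsigned int order,
				       unsigned int merges)
{
	unsigned int start = order;
	unsigned int i;

	for (i = 0; i < merges && order < MODEL_MAX_ORDER; i++)
		order++;
	model_account(z, 1L << start);
	return order;
}
```

Both variants leave the counter at the same value. They differ only in how
many times it is touched, which is the cost in question for the hot loop.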