Date: Fri, 26 Aug 2022 11:52:08 +0200
From: Jan Kara
To: Stefan Wahren
Cc: Jan Kara, Ted Tso, linux-ext4@vger.kernel.org, Thorsten Leemhuis, Ojaswin Mujoo, Harshad Shirwadkar
Subject: Re: [PATCH 0/2] ext4: Fix performance regression with mballoc
Message-ID: <20220826095208.a3xjakfsfhjwo2wa@quack3>
In-Reply-To: <596afc6f-4c54-3269-ac84-36bc266cc898@i2se.com>
X-Mailing-List: linux-ext4@vger.kernel.org

Hi Stefan!

On Thu 25-08-22 17:48:32, Stefan Wahren wrote:
> On 25.08.22 at 11:18, Jan Kara wrote:
> > On Wed 24-08-22 23:24:43, Stefan Wahren wrote:
> > > On 24.08.22 at 12:40, Jan Kara wrote:
> > > > On Wed 24-08-22 12:17:14, Stefan Wahren wrote:
> > > > > On 23.08.22 at 22:15, Jan Kara wrote:
> > > > > > So I have implemented mballoc improvements to avoid spreading allocations
> > > > > > even with mb_optimize_scan=1. It fixes the performance regression I was able
> > > > > > to reproduce with reaim on my test machine:
> > > > > >
> > > > > >                      mb_optimize_scan=0   mb_optimize_scan=1              patched
> > > > > > Hmean     disk-1      2076.12 (  0.00%)    2099.37 (  1.12%)    2032.52 ( -2.10%)
> > > > > > Hmean     disk-41    92481.20 (  0.00%)   83787.47 * -9.40%*   90308.37 ( -2.35%)
> > > > > > Hmean     disk-81   155073.39 (  0.00%)  135527.05 *-12.60%*  154285.71 ( -0.51%)
> > > > > > Hmean     disk-121  185109.64 (  0.00%)  166284.93 *-10.17%*  185298.62 (  0.10%)
> > > > > > Hmean     disk-161  229890.53 (  0.00%)  207563.39 * -9.71%*  232883.32 *  1.30%*
> > > > > > Hmean     disk-201  223333.33 (  0.00%)  203235.59 * -9.00%*  221446.93 ( -0.84%)
> > > > > > Hmean     disk-241  235735.25 (  0.00%)  217705.51 * -7.65%*  239483.27 *  1.59%*
> > > > > > Hmean     disk-281  266772.15 (  0.00%)  241132.72 * -9.61%*  263108.62 ( -1.37%)
> > > > > > Hmean     disk-321  265435.50 (  0.00%)  245412.84 * -7.54%*  267277.27 (  0.69%)
> > > > > >
> > > > > > Stefan, can you please test whether these patches fix the problem for you as
> > > > > > well? Comments & review welcome.
> > > > > I tested the whole series against 5.19 and 6.0.0-rc2.
> > > > > In both cases the update process succeeded, which is an improvement,
> > > > > but the download + unpack duration (~7 minutes) is not as good as with
> > > > > mb_optimize_scan=0 (~1 minute).
> > > > OK, thanks for testing! I'll check untar specifically to see whether I can
> > > > still see some differences in the IO pattern on my test machine.
> > > I made two iostat output logs during the complete download phase with 5.19
> > > and your series applied. iostat was running via an ssh connection and
> > > rpi-update via the serial console.
> > >
> > > First with mb_optimize_scan=0
> > >
> > > https://github.com/lategoodbye/mb_optimize_scan_regress/blob/main/5.19_SDCIT_patch_nooptimize_download_success.iostat.log
> > >
> > > Second with mb_optimize_scan=1
> > >
> > > https://github.com/lategoodbye/mb_optimize_scan_regress/blob/main/5.19_SDCIT_patch_optimize_download_success.iostat.log
> > >
> > > Maybe this helps
> > Thanks for the data! So this is interesting. In both iostat logs, there is
> > an initial phase where no IO happens. I guess that's expected. It is
> > significantly longer in the mb_optimize_scan=0 case but I suppose that is
> > just caused by a difference in when iostat was actually started. Then in
> > mb_optimize_scan=0 there are 155 seconds where the eMMC card is 100%
> > utilized and then iostat ends. During this time ~63MB is written
> > altogether. Request sizes vary a lot, average is 60KB.
> >
> > In the mb_optimize_scan=1 case there are 715 seconds recorded where the eMMC
> > card is 100% utilized. During this time ~133MB is written, average request
> > size is 40KB. If I look just at the first 155 seconds of the trace (assuming
> > iostat was in both cases terminated before writing was fully done), we have
> > written ~53MB and average request size is 56KB.
> >
> > So with mb_optimize_scan=1 we are indeed still somewhat slower but based on
> > the trace it is not clear why the download+unpack should take 7 minutes
> > instead of 1 minute.
> > There must be some other effect we are missing.
> >
> > Perhaps if you just download the archive manually, call sync(1), and measure
> > how long it takes to (untar the archive + sync) in mb_optimize_scan=0/1 we
> > can see whether plain untar is indeed making the difference or there's
> > something else influencing the result as well (I have checked and
> > rpi-update does a lot of other deleting & copying as part of the
> > update)? Thanks.
>
> I will provide those iostats.
>
> Btw I untarred the firmware archive (mb_optimize_scan=1 and your patch) and
> got the following:
>
> cat /proc/fs/ext4/mmcblk1p2/mb_structs_summary
>
> optimize_scan: 1
> max_free_order_lists:
>         list_order_0_groups: 5
>         list_order_1_groups: 0
>         list_order_2_groups: 0
>         list_order_3_groups: 0
>         list_order_4_groups: 1
>         list_order_5_groups: 0
>         list_order_6_groups: 1
>         list_order_7_groups: 1
>         list_order_8_groups: 10
>         list_order_9_groups: 1
>         list_order_10_groups: 2
>         list_order_11_groups: 0
>         list_order_12_groups: 2
>         list_order_13_groups: 55
> fragment_size_tree:
>         tree_min: 1
>         tree_max: 31249
>         tree_nodes: 79
>
> Is this expected?

Yes, I don't see anything out of the ordinary in this for a used filesystem.
It tells us there are 55 groups with big chunks of free space; there are some
groups which have only small chunks of free space, but that's expected when
the filesystem is reasonably filled...

								Honza
-- 
Jan Kara
SUSE Labs, CR
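
For reference, the iostat figures quoted in the thread above can be turned into average write rates with a few lines of Python. This is only a sketch of the arithmetic: the dictionary keys and the helper name write_rate are illustrative, and the inputs are the approximate (~) megabyte and second values from the discussion, not numbers re-derived from the logs.

```python
# Approximate figures quoted in the thread:
# (MB written, seconds during which the eMMC card was 100% utilized).
runs = {
    "mb_optimize_scan=0, full trace": (63, 155),
    "mb_optimize_scan=1, full trace": (133, 715),
    "mb_optimize_scan=1, first 155 s": (53, 155),
}


def write_rate(megabytes, seconds):
    """Average write rate in MB/s over the given interval."""
    return megabytes / seconds


rates = {name: write_rate(mb, secs) for name, (mb, secs) in runs.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.2f} MB/s")

# Data written with mb_optimize_scan=1 relative to mb_optimize_scan=0
# over the same first 155 seconds.
ratio_first_155s = runs["mb_optimize_scan=1, first 155 s"][0] / runs["mb_optimize_scan=0, full trace"][0]
print(f"mb_optimize_scan=1 vs 0 over the first 155 s: {ratio_first_155s:.2f}")
```

Over the first 155 seconds the mb_optimize_scan=1 run writes roughly 16% less data than the mb_optimize_scan=0 run, which is consistent with "somewhat slower" but nowhere near the difference between 7 minutes and 1 minute.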