Received: by 2002:a6b:500f:0:0:0:0:0 with SMTP id e15csp5738081iob; Tue, 10 May 2022 02:27:40 -0700 (PDT) X-Google-Smtp-Source: ABdhPJznP0vFc8C8EBCgSmaey75kaN/Y/aBKhNbYqONPKI34bhpXy26kGcM+uH20RaOzf26w6H+X X-Received: by 2002:a17:902:8605:b0:15d:10dc:1c6f with SMTP id f5-20020a170902860500b0015d10dc1c6fmr20178892plo.4.1652174860611; Tue, 10 May 2022 02:27:40 -0700 (PDT) Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id z17-20020aa79e51000000b0050a4ae72df4si15380169pfq.361.2022.05.10.02.27.25; Tue, 10 May 2022 02:27:40 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=UKja5lMt; arc=fail (signature failed); spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235229AbiEJDrc (ORCPT + 99 others); Mon, 9 May 2022 23:47:32 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37894 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231928AbiEJDr0 (ORCPT ); Mon, 9 May 2022 23:47:26 -0400 Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6B7AB6420 for ; Mon, 9 May 2022 20:43:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1652154210; x=1683690210; h=message-id:date:subject:to:cc:references:from: in-reply-to:content-transfer-encoding:mime-version; bh=dWW/vIRp02qWnp08nhscfg45qIG7F+uB+Ug+ef40D74=; b=UKja5lMt40p5OVtg6Vz2RX073cs1RERC1e7GxE0nyilD3a1q6Xks7MnD vxUKvCAnCfsQygCAeF1HHoGHKXRH8bqQDvGuJLKGnNuWWb/EIqUR2XnGx /O8MsaHp7O5w0PS//+Yo6byC0LdNZFmUAs2gGUuElf/Gh6mEsb82XNcZR ErENOZNyFm5FhNfD4zJHDcYb2pUMoD54HS6kCG+N2EPTtmukuwSotEpie i4icANKeh4h00SoEdx19qvlxBdN3UVeLrxgX178ztuIuDNvWLnMq1i0Ce 7qU/JoE+gFZ7LEESAupZ7lIfuJzhjn45K6b4POWr0l6f95T3tU+9ePfJV w==; X-IronPort-AV: E=McAfee;i="6400,9594,10342"; a="269386031" X-IronPort-AV: E=Sophos;i="5.91,213,1647327600"; d="scan'208";a="269386031" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 May 2022 20:43:30 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,213,1647327600"; d="scan'208";a="565429734" Received: from fmsmsx605.amr.corp.intel.com ([10.18.126.85]) by orsmga007.jf.intel.com with ESMTP; 09 May 2022 20:43:29 -0700 Received: from fmsmsx606.amr.corp.intel.com (10.18.126.86) by fmsmsx605.amr.corp.intel.com (10.18.126.85) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.27; Mon, 9 May 2022 20:43:28 -0700 Received: from fmsedg602.ED.cps.intel.com (10.1.192.136) by fmsmsx606.amr.corp.intel.com (10.18.126.86) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.27 via Frontend Transport; Mon, 9 May 2022 20:43:28 -0700 Received: from NAM04-MW2-obe.outbound.protection.outlook.com (104.47.73.168) by edgegateway.intel.com (192.55.55.71) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.1.2308.27; Mon, 9 May 2022 20:43:28 -0700 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=gkSw2qCOa/U0OA5TsbNF7+wypXR7yYyB6snO9X2oLSbYUjpXRN17hCwGHGnWssaf+F/MFV9raEEz7hV51T9Y8Z8drxXeN4LUjwrPHPyKaPGzSJir1eWARwCpUhmFNkbCgFA+BDuEEkIHMQQzDO5HXiGu87PL+m+++HXJTpDT91HYlCYV2skz1PDse9JOipFbEaaR9/WkBL3zo7QhK513zul/ndzV9kylYjXADCANeD6xFtWVJHDGV7+lSYCWoFe8kHOPcj+ZgfjHUlZ1/jKx4+NpO3b2hMKAc/PS3/s4CVDdJpE//71U6TAYFeg3sgTHoW7TjF7eP1CR2wJCc3MZpw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=PTSkyecn+rp9uEgvLHL/4IwuGLv8nKyWpY1Qrm4+6a8=; b=b808HKeG1WZ4jeHUgljgTLqj0Xx3idkNsaULiv77X3zgM2qgjDqnqDxwu53pSbK/QVfVMiyUTld79bneL/9nAw6P7rZamU+6FQ8bLSXZmWX2K0qXtu30gN2ySPEkezUhQXQFUmD4bQln9hxSStQE3Cv9iM4E7S44rUkhS+dLr3m5u0zQd6jHZS1benLZcPBpTkh8O7mcbWSlHHrH5eH0hdYaTlEp/N+ezNeBT+SBfC2M2n/w0Z3Q9NfwDuZa6MWSMnz8CBLzkoZQOc/QjH/MjOTtyokBtTiEomcvqRWP8SjVudLT5tOe88ttqnK+5EoacpUUZripwhe2zGhRqh/k4A== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=intel.com; dmarc=pass action=none header.from=intel.com; dkim=pass header.d=intel.com; arc=none Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=intel.com; Received: from BYAPR11MB3062.namprd11.prod.outlook.com (2603:10b6:a03:92::18) by BL0PR11MB3121.namprd11.prod.outlook.com (2603:10b6:208:7e::31) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5227.22; Tue, 10 May 2022 03:43:26 +0000 Received: from BYAPR11MB3062.namprd11.prod.outlook.com ([fe80::90eb:f194:6b40:c4ff]) by BYAPR11MB3062.namprd11.prod.outlook.com ([fe80::90eb:f194:6b40:c4ff%5]) with mapi id 15.20.5227.023; Tue, 10 May 2022 03:43:25 +0000 Message-ID: Date: Tue, 10 May 2022 11:43:16 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.9.0 Subject: Re: [mm/page_alloc] f26b3fa046: netperf.Throughput_Mbps -18.0% regression Content-Language: en-US To: "ying.huang@intel.com" CC: Mel Gorman , kernel test robot , Linus Torvalds , Vlastimil Babka , Dave Hansen , Jesper Dangaard Brouer , Michal Hocko , Andrew Morton , LKML , , , , , References: <20220420013526.GB14333@xsang-OptiPlex-9020> <7d20a9543f69523cfda280e3f5ab17d68db037ab.camel@intel.com> From: Aaron Lu In-Reply-To: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 8bit X-ClientProxiedBy: HK2PR04CA0073.apcprd04.prod.outlook.com (2603:1096:202:15::17) To BYAPR11MB3062.namprd11.prod.outlook.com (2603:10b6:a03:92::18) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 2f91bc50-2830-4cb6-c2f5-08da32373cfd X-MS-TrafficTypeDiagnostic: BL0PR11MB3121:EE_ X-LD-Processed: 46c98d88-e344-4ed4-8496-4ed7712e255d,ExtAddr X-Microsoft-Antispam-PRVS: X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: a2QLU6ubGAFyNQ7GmAX35dv5N8qbnVECVgX6TUxqB2hpCTkefzShwltgWAHz5/VPNre48jMayJGrMnkvqO0mYygKKyWhp/EqY3BDKnqTVv3dTsvumOjoGaoWVj1BGXZBMpSUqs4Tmia/pGUxprRmAfyFMjsEOhqE3nHm4bYyfnLMMjcz2cUdYcwHxT0Y2oX54MyIjLHMX20j5w6A9uvkImiC/JsoL/jTCpcx4Gs+rlg1XMJh0k9GEKW4RhbQ8pjgKLcNTHzRqUcgs95uKovWPWsxhGowfg0Dlan7oTWYF9IVy0lsmqVJuvp0STDmqXh8wSn0XEnGABJmhZYRVwLPr5fU2qj39Olz2jE9hh6iWYx9sJDpRzf1t8YH9cEObSMTiHTkp7gSOyacl87V28E85ejI+tP0S5aNohISf9H9P7FcAYoha+bZLBeLd5bosxQzF3CqBLT7vUnS5wUxjCVNJzAgygZCkGz+s11RFIfdEDK9+JapqWG+IGtmODLSLPSeOG5bB6rfP1KNiAZ1gMaUVDJsCgzy+O8gHycOXADSfZOg3I75r+Ru2H/KjYSnyARkMPJu/qMP0P5e5dGlBxmcDSe9uULo4NXGR4DT6BPHPOJc6W6ZLZj9Qn5IUEO4ZurOs43WDJMLLlZ6dGwGNNiJjp64Rl/kR/z6uN3G8AavmHZsvhdoEoIH1/l1fpWnVGoIE+TfDX/jR/4YWYpyMtAi4SB2CLvfW3IGKMEgNyqq+JU= X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:BYAPR11MB3062.namprd11.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230001)(366004)(53546011)(26005)(2616005)(6506007)(44832011)(83380400001)(5660300002)(31686004)(6512007)(2906002)(7416002)(8936002)(36756003)(508600001)(38100700002)(66946007)(66476007)(66556008)(4326008)(6862004)(8676002)(86362001)(6666004)(31696002)(82960400001)(186003)(6636002)(6486002)(54906003)(37006003)(316002)(43740500002)(45980500001);DIR:OUT;SFP:1102; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: =?utf-8?B?QUh5Sm9zS24rTkE3UWlpZW5aKzk5NlBxOC9SRmtRTTFRNTk1ZTkxUXJMTzdl?= =?utf-8?B?RXhMR3dlOThpZ0VKYXhpQ1EzUk5URXZwK0FwYjhMQy9PUTdiMFI4Z2QraHZs?= =?utf-8?B?S1M4aCtvSUFkYjBnTmlwb0h2Sm5Pd1J5aU9WajI4U05DUHpac0JnQWRZVVhQ?= =?utf-8?B?UnJCOGF5elhmbG1DNWpDS09DSG1vZ3dTVnhLa0JZQ1NNeWRUOFpKTG5wR3Ir?= =?utf-8?B?UHVXbVFMWFNzckZtckt0TVpWcUl1M1ZGSUdvZ0NSQnNDU1VHL1ZqTFFIZnk5?= =?utf-8?B?aDJkcU5XUGk5aThYUmxYZ0hYUkttRkl5OThOQk5zVU5vbUszNzJNaFdiRkFx?= =?utf-8?B?ZDA1NllrSWRlK09haFQva3BSenZOVEVKcWNlVW1xdHhRZ0hBK3FTQW80ZTZm?= =?utf-8?B?T1pXb1QxeXIrUWx5akNZN3hsdHFEUTByMUlwemx1MThzOWRPL2hQamFrT1dQ?= =?utf-8?B?czRCSmkzL2tQTDd1SGl5b3RwZDM0bDYvSGR6U21Sb0h5b0dZWXNXYlJVUTVw?= =?utf-8?B?Q1JyV3RxK3lEUmkzdXZXdDVybkFnNWRkS2tLRGNHNHBhR2YzdXZ6VkJLaTlF?= =?utf-8?B?V0lZcDJFNGVONFpxUW5qaFFuUlpKUE9iK0JSMmxYZXU3Q3R4d1NWWGFlQWlI?= =?utf-8?B?UlozVmZpY3F2N3ZRQjN5MmNEcEhYTVdjMmtteWhPSHZDU3VxZUUrbktDRTNG?= =?utf-8?B?eUFWYUxGaFIxMWVnWkJXbFFLQnkxT0FJeFRzSVloRnRSTEFibnVFdy8xV05V?= =?utf-8?B?bE1SQzRFRWIrOHRVMTA3VmFoVkdRK3FYRThaWlJReW0zOCs3aWxQcStNdCtM?= =?utf-8?B?RTFKSWVLdTY5ZkRaTlBabVFpVmJ4Q3hLSWpxOXNrMWU3MGpkemg3NDdlNGNU?= =?utf-8?B?QlRuMnVzYUFDOW8xZmVhU2I2djZ5U2VzQm9lbTkwQnlnWE9XS2xHU0FPbnNk?= =?utf-8?B?R1lFQ3R1Z2xEdm0zcVNOY3ZjQjlBbTNYVXpKc3FlTnNiN0J3K0JYQnFHeE5I?= =?utf-8?B?aVFES092Uy9LQTBYdy9rNDMvb014U3lucWozK0N0NzB0ZVlQTXV1R2V0cU5B?= =?utf-8?B?cWlLOWxsa04xMVQ1YWRHaGRHektDR3lIRy9mbVl1aXRseWpIVFR4WFhjMTNL?= =?utf-8?B?eDhCWitFNGdJdDhJN3cvQjVIbXZsSWdVUURyZ00vZ0Q5NXJYaXVhSElJaXI1?= =?utf-8?B?ZVN5WWZiY000Z3JRbUoyREdBeWdCQW1NWENBV05sdEprVzJIclhkZjFCK1NI?= =?utf-8?B?T3hhMGplSElpdWo2cXNkU1VDbGpKTWw1OHVRWjZ4dFFVRXRyWllBanhDSm1n?= =?utf-8?B?Sk9kbG5wZWsraCtvNkdIdDVIM2k5WmlKNmlvQ0hjM1M2UlY2S09kdU5Na1ps?= =?utf-8?B?aDdYV01vSXhtVk45Y1lUcm55bTZaNzhCbE5GSDRieEVYa0h1c3JOWGxWR0to?= =?utf-8?B?dG42a3dGK1V0VWpSSllLRXdvUDhwQXRNZE5HOHVuRW10b0JOZDN5V1Iya2Z0?= =?utf-8?B?SkhocWQra0RibTRqckpuN2ZJU2FEZ3hMcnVoOE94eWh4ZzZuc3dia2ZvSE90?= =?utf-8?B?NjhKcDVldTJmSW9iT01XMjF2ODJ4aUlhMTdRMWxSNE5vbXNIK3YwTUpHOW9N?= =?utf-8?B?ZEdFMDNtaDY1RUdxSmNtL2xJSERTSGdCZEs4TGlmOFBFeW5Fc1pZVHprZDFo?= =?utf-8?B?Y0lsZGZoT3ZSRzBrZWZsNjU3Qm5HUUVmWlNTekg5Zm9oRHhEUzVjUnVWRWVN?= =?utf-8?B?clg2azdqMEVuQW5PMGRyb2JMN3E1SjBUQy91WHJrRWkyUkx0N00zU1BLb1Bw?= =?utf-8?B?OUdyaGRVdFpCbm9MSXJ6dkFxODJkSThKWk1va1AyY2RuUVFJWEJwdS85UjVE?= =?utf-8?B?WC9pTmdvdWpwYkthQWRpTFp4UGhVS2lUYThocGlpcGF3WDBmejBVYmFJTXRU?= =?utf-8?B?RjdjcGRJdGp6NlAvWDZSaXhwRmJCZjVjTVhPQ0EzUVU2ZnB5R0V4a3U3Rks4?= =?utf-8?B?eFk3LzFkdU5kQk00K2xnc0hDbzA0MnpONE8vN3pZSU5XOUNja3BteE1EMGJv?= =?utf-8?B?eVp4bjVIWEVuenFjSTAzSmIyNjc2QnpCVFRvTWZ1cTNrYnpicDNiNDA5SkVW?= =?utf-8?B?cytwY2paWDBBcGF3MWlHczdpU2YyUmI5cDI0UXliQjc3a2pjcVoraXdNYXpj?= =?utf-8?B?QTkrVjNiQmZHaDFwdTRxWm5WY0dMSGU4THVublJ0UWJuQ21veHVJZ0wrSEJR?= =?utf-8?B?WXVPRXRBSGg3WGwxa2syUGl2VU1HbHhHclFpMEtzOWFlUG5qQWVkeXdXTjYv?= =?utf-8?B?M2JKWWpNVElZRXlsWGJLTjYvVHdtL0NNTnBGZWRSL2plU1BwR3Y5dz09?= X-MS-Exchange-CrossTenant-Network-Message-Id: 2f91bc50-2830-4cb6-c2f5-08da32373cfd X-MS-Exchange-CrossTenant-AuthSource: BYAPR11MB3062.namprd11.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 10 May 2022 03:43:25.7324 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 46c98d88-e344-4ed4-8496-4ed7712e255d X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: Ub3d+2ik++YTGVNe3BuG0ZwYvgrLumCtKroNP6MxIMPeSgDwsd3iA6P+AxuwHcnm39ahNYb0TMM+Iats1qMWLA== X-MS-Exchange-Transport-CrossTenantHeadersStamped: BL0PR11MB3121 X-OriginatorOrg: intel.com X-Spam-Status: No, score=-8.8 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,NICE_REPLY_A, RCVD_IN_DNSWL_HI,SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE, URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 5/7/2022 3:44 PM, ying.huang@intel.com wrote: > On Sat, 2022-05-07 at 15:31 +0800, Aaron Lu wrote: ... ... >> >> I thought the overhead of changing the cache line from "shared" to >> "own"/"modify" is pretty cheap. > > This is the read/write pattern of cache ping-pong. Although it should > be cheaper than the write/write pattern of cache ping-pong in theory, we > have gotten sevious regression for that before. > Can you point me to the regression report? I would like to take a look, thanks. >> Also, this is the same case as the Skylake desktop machine, why it is a >> gain there but a loss here?  > > I guess the reason is the private cache size. The size of the private > L2 cache of SKL server is much larger than that of SKL client (1MB vs. > 256KB). So there's much more core-2-core traffic on SKL server. > It could be. The 256KiB L2 in Skylake desktop can only store 8 order-3 pages and that means the allocator side may have a higher chance of reusing a page that is evicted from the free cpu's L2 cache than the server machine, whose L2 can store 40 order-3 pages. I can do more tests using different high for the two machines: 1) high=0, this is the case when page reuse is the extreme. core-2-core transfer should be the most. This is the behavior of this bisected commit. 2) high=L2_size, this is the case when page reuse is fewer compared to the above case, core-2-core should still be the majority. 3) high=2 times of L2_size and smaller than llc size, this is the case when cache reuse is further reduced, and when the page is indeed reused, it shouldn't cause core-2-core transfer but can benefit from llc. 4) high>llc_size, this is the case when page reuse is the least and when page is indeed reused, it is likely not in the entire cache hierarchy. This is the behavior of this bisected commit's parent commit for the Skylake desktop machine. I expect case 3) should give us the best performance and 1) or 4) is the worst for this testcase. case 4) is difficult to test on the server machine due to the cap of pcp->high which is affected by the low watermark of the zone. The server machine has 128 cpus but only 128G memory, which makes the pcp->high capped at 421, while llc size is 40MiB and that translates to a page number of 12288. >> Is it that this "overhead" is much greater >> in server machine to the extent that it is even better to use a totally >> cold page than a hot one? > > Yes. And I think the private cache size matters here. And after being > evicted from the private cache (L1/L2), the cache lines of the reused > pages will go to shared cache (L3), that will help performance. > Sounds reasonable. >> If so, it seems to suggest we should avoid >> cache reuse in server machine unless the two CPUs happens to be two >> hyperthreads of the same core. > > Yes. I think so.