Date: Wed, 7 Sep 2022 17:39:47 +0800
From: "Sun, Jiebin" <jiebin.sun@intel.com>
Subject: Re: [PATCH] ipc/msg.c: mitigate the lock contention with percpu counter
To: Tim Chen, Shakeel Butt
Cc: Andrew Morton, vasily.averin@linux.dev, Dennis Zhou, Tejun Heo,
    Christoph Lameter, "Eric W. Biederman", Alexey Gladkov, Manfred Spraul,
    alexander.mikhalitsyn@virtuozzo.com, Linux MM, LKML, "Chen, Tim C",
    Feng Tang, Huang Ying, tianyou.li@intel.com, wangyang.guo@intel.com,
    jiebin.sun@intel.com
References: <20220902152243.479592-1-jiebin.sun@intel.com>
    <048517e7f95aa8460cd47a169f3dfbd8e9b70d5c.camel@linux.intel.com>
In-Reply-To: <048517e7f95aa8460cd47a169f3dfbd8e9b70d5c.camel@linux.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 9/7/2022 2:44 AM, Tim Chen wrote:
> On Fri, 2022-09-02 at 09:27 -0700, Shakeel Butt wrote:
>> On Fri, Sep 2, 2022 at 12:04 AM Jiebin Sun wrote:
>>> The msg_bytes and msg_hdrs atomic counters are frequently
>>> updated when the IPC msg queue is in heavy use, causing heavy
>>> cache bounce and
>>> overhead. Changing them to percpu_counters
>>> greatly improves the performance. Since there is only one unique
>>> ipc namespace, the additional memory cost is minimal. Reading
>>> of the count is done in the msgctl call, which is infrequent. So
>>> the need to sum up the counts from each CPU is infrequent.
>>>
>>> Apply the patch and test with pts/stress-ng-1.4.0
>>> -- system v message passing (160 threads).
>>>
>>> Score gain: 3.38x
>>>
>>> CPU: ICX 8380 x 2 sockets
>>> Core number: 40 x 2 physical cores
>>> Benchmark: pts/stress-ng-1.4.0
>>> -- system v message passing (160 threads)
>>>
>>> Signed-off-by: Jiebin Sun
>> [...]
>>> +void percpu_counter_add_local(struct percpu_counter *fbc, s64 amount)
>>> +{
>>> +	this_cpu_add(*fbc->counters, amount);
>>> +}
>>> +EXPORT_SYMBOL(percpu_counter_add_local);
>> Why not percpu_counter_add()? This may drift fbc->count by more than
>> batch*nr_cpus. I am assuming that is not an issue for you, as you
>> always do an expensive sum in the slow path. As Andrew asked, this
>> should be a separate patch.
> In the IPC case, the read is always done as an accurate read using
> percpu_counter_sum(), gathering all the per-CPU counts, and
> never with percpu_counter_read(), which only reads the global count.
> So Jiebin was not worried about accuracy.
>
> However, the counter is s64 and the local per-cpu counter is s32.
> So the counter size has shrunk if we only keep the count in the local
> per-cpu counter, which can overflow a lot sooner, and that is not okay.
>
> Jiebin, can you try percpu_counter_add_batch with a large
> batch size? That should achieve what you want without needing
> to create a percpu_counter_add_local() function, and it also avoids
> the overflow problem.
>
> Tim
>
I have sent out patch v4, which uses percpu_counter_add_batch. With a
tuned large batch size (1024), the performance gain is 3.17x (patch v4)
vs 3.38x (patch v3) previously in stress-ng -- message passing.
It still gives a significant performance improvement, and a good balance
between the performance gain and the overflow issue.

Jiebin