From: "Sun, Jiebin"
Date: Tue, 20 Sep 2022 13:50:20 +0800
Subject: Re: [PATCH v6 2/2] ipc/msg: mitigate the lock contention with percpu counter
To: Manfred Spraul, akpm@linux-foundation.org, vasily.averin@linux.dev, shakeelb@google.com, dennis@kernel.org, tj@kernel.org, cl@linux.com, ebiederm@xmission.com, legion@kernel.org, alexander.mikhalitsyn@virtuozzo.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: tim.c.chen@intel.com, feng.tang@intel.com, ying.huang@intel.com, tianyou.li@intel.com, wangyang.guo@intel.com, Tim Chen

On 9/20/2022 12:53 PM, Manfred Spraul wrote:
> On 9/20/22 04:36, Sun, Jiebin wrote:
>>
>> On 9/18/2022 8:53 PM, Manfred Spraul wrote:
>>> Hi Jiebin,
>>>
>>> On 9/13/22 21:25, Jiebin Sun wrote:
>>>> The msg_bytes and msg_hdrs atomic counters are frequently
>>>> updated when IPC msg queue is in heavy use, causing heavy
>>>> cache bounce and overhead. Change them to percpu_counter
>>>> greatly improve the performance. Since there is one percpu
>>>> struct per namespace, additional memory cost is minimal.
>>>> Reading of the count done in msgctl call, which is infrequent.
>>>> So the need to sum up the counts in each CPU is infrequent.
>>>>
>>>> Apply the patch and test the pts/stress-ng-1.4.0
>>>> -- system v message passing (160 threads).
>>>>
>>>> Score gain: 3.99x
>>>>
>>>> CPU: ICX 8380 x 2 sockets
>>>> Core number: 40 x 2 physical cores
>>>> Benchmark: pts/stress-ng-1.4.0
>>>> -- system v message passing (160 threads)
>>>>
>>>> Signed-off-by: Jiebin Sun
>>>> Reviewed-by: Tim Chen
>>> Reviewed-by: Manfred Spraul
>>>> @@ -495,17 +496,18 @@ static int msgctl_info(struct ipc_namespace *ns, int msqid,
>>>>       msginfo->msgssz = MSGSSZ;
>>>>       msginfo->msgseg = MSGSEG;
>>>>       down_read(&msg_ids(ns).rwsem);
>>>> -    if (cmd == MSG_INFO) {
>>>> +    if (cmd == MSG_INFO)
>>>>           msginfo->msgpool = msg_ids(ns).in_use;
>>>> -        msginfo->msgmap = atomic_read(&ns->msg_hdrs);
>>>> -        msginfo->msgtql = atomic_read(&ns->msg_bytes);
>>>> +    max_idx = ipc_get_maxidx(&msg_ids(ns));
>>>> +    up_read(&msg_ids(ns).rwsem);
>>>> +    if (cmd == MSG_INFO) {
>>>> +        msginfo->msgmap = percpu_counter_sum(&ns->percpu_msg_hdrs);
>>>> +        msginfo->msgtql = percpu_counter_sum(&ns->percpu_msg_bytes);
>>>
>>> Not caused by your change, it just now becomes obvious:
>>>
>>> msginfo->msgmap and ->msgtql are type int, i.e. signed 32-bit, and
>>> the actual counters are 64-bit.
>>> This can overflow - and I think the code should handle this. Just
>>> clamp the values to INT_MAX.
>>>
>> Hi Manfred,
>>
>> Thanks for your advice. But I'm not sure if we could fix the overflow
>> issue in ipc/msg totally by clamp(val, low, INT_MAX). If the value is
>> over s32, we might avoid the reversal sign, but still could not get
>> the accurate value.
>
> I think just clamping it to INT_MAX is the best approach.
> Reporting negative values is worse than clamping. If (and only if)
> there are real users that need to know the total amount of memory
> allocated for messages queues in one namespace, then we could add a
> MSG_INFO64 with long values. But I would not add that right now, I do
> not see a real use case where the value would be needed.
>
> Any other opinions?
>
> --
>
>     Manfred
>

OK. I will work on it and send it out for review.