From: John Garry
To: Ian Rogers
CC: Will Deacon, James Clark, Mike Leach, Leo Yan, Peter Zijlstra,
 Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
 Alexander Shishkin, Jiri Olsa, Namhyung Kim, Andi Kleen,
 Zhengjun Xing, Ravi Bangoria, Kan Liang, Adrian Hunter,
 Stephane Eranian
Subject: Re: [PATCH v3 00/17] Compress the pmu_event tables
Date: Tue, 2 Aug 2022 10:08:47 +0100
Message-ID: <3d0c1ec0-42ec-8c51-743b-5d93cabb53fb@huawei.com>
References: <20220729074351.138260-1-irogers@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 29/07/2022 18:27, Ian Rogers wrote:
>> This implementation would require core pmu.c to be changed, but there is
>> ways that this could be done without needing to change core pmu.c
>>
>> Thanks,
>> John
> Thanks John!
>
> You are right about broadwell, it is an extreme case of sharing. IIRC
> BDX is the server core/uncore events, BDW is the consumer core/uncore
> and BDW-DE is consumer core with server uncore - so the sharing is
> inherent in this. Metrics become interesting as they may mix core and
> uncore, but I'll ignore that here.
>
> In the old code every event needs 15 char*s, with 64-bit that is 15*8
> bytes per entry with 741 core and 23 uncore entries for BDW, and 372
> core and 1284 uncore for BDX. I expect the strings themselves will be
> shared by the C compiler, and so I just focus on the pointer sizes.
> With the new code every event is just 1 32-bit int.
> So for BDW we go
> from 15*8*(741+23)=91,680 to 4*(741+23)=3056, BDX is
> 15*8*(372+1284)=198720 to 4*(372+1284)=6624. This means we've gone
> from 290,400bytes to 9,680bytes for BDW and BDX. BDW-DE goes from
> 243,000bytes to 8,100bytes - we can ignore the costs of the strings as
> they should be fully shared, especially for BDW-DE.

Are you sure about this? I did not think that the compiler would have
the intelligence to try to share strings. That is the basis of the size
optimisation which I was proposing (that the compiler would not share
strings).

>
> If we added some kind of table sharing, so BDW-DE was core from BDW
> and uncore from BDX and the tables shared, then in the old code you
> could save nearly 0.25MB but with the new code the saving is only
> around 8KB. I think we can go after that 8KB but it is less urgent
> after this change which gets 96% of the benefit. We have 72
> architectures for jevents at the moment and so I'd ball park (assuming
> they all saved as much as BDW-DE) the max saving as about 0.5MB, which
> is 12% of what is saved here.
>
> Longer term I'd like to make the pmu-events.c logic look closer to the
> sysfs API. Perhaps we can unify the uncore events for BDX and BDW-DE
> with some notion of a common PMU, or PMUs with common events and
> tables, and automate deduction of this. It also isn't clear to me the
> advantage of storing the metrics inside the events, separate tables
> feel cleaner. Anyway, there's lots of follow up.

Thanks,
John
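
For reference, the pointer-size arithmetic quoted above can be checked with a short Python sketch. The entry counts and the 15-pointer, 8-byte and 4-byte figures are taken from the quoted mail. The names (ENTRIES, table_bytes) are made up for illustration, and the BDW-DE count is an assumption (BDW core plus BDX uncore, as the quoted text describes it). String storage is deliberately left out, since whether the compiler shares strings is the open question here.

```python
# Sketch only: reproduces the table-size arithmetic from the quoted mail.
# All names are hypothetical; the counts and sizes are the quoted ones.

POINTERS_PER_EVENT = 15  # char* fields per event in the old layout
POINTER_BYTES = 8        # 64-bit pointers
OFFSET_BYTES = 4         # one 32-bit int per event in the new layout

# (core, uncore) entry counts as quoted for each model
ENTRIES = {
    "BDW": (741, 23),
    "BDX": (372, 1284),
}
# Assumption: BDW-DE is BDW core plus BDX uncore, per the description above.
ENTRIES["BDW-DE"] = (ENTRIES["BDW"][0], ENTRIES["BDX"][1])


def table_bytes(model):
    """Return (old_bytes, new_bytes) for one model, ignoring string storage."""
    events = sum(ENTRIES[model])
    old = events * POINTERS_PER_EVENT * POINTER_BYTES
    new = events * OFFSET_BYTES
    return old, new


if __name__ == "__main__":
    for model in ENTRIES:
        old, new = table_bytes(model)
        saved = 100 * (old - new) / old
        print(f"{model}: {old} -> {new} bytes ({saved:.1f}% smaller)")
```

The per-model totals this gives match the figures in the quoted mail, and the ratio is 4/120 regardless of the entry count, which is where the roughly 96% figure comes from.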