Date: Mon, 21 Aug 2023 11:16:30 +0100
Message-ID: <86msykg0ox.wl-maz@kernel.org>
From: Marc Zyngier
To: Xu Zhao
Cc: pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, zhouyibo@bytedance.com, zhouliang.001@bytedance.com, Oliver Upton, kvmarm@lists.linux.dev, Mark Rutland
Subject: Re: [RFC] KVM: arm/arm64: optimize vSGI injection performance
References: <20230818104704.7651-1-zhaoxu.35@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 21
Aug 2023 09:59:17 +0100,
Mark Rutland wrote:
> 
> [adding the KVM/arm64 maintainers & list]

Thanks for that.

> 
> Mark.
> 
> On Fri, Aug 18, 2023 at 06:47:04PM +0800, Xu Zhao wrote:
> > In the worst case scenario, it may iterate over all vCPUs in the vm in order to complete
> > injecting an SGI interrupt. However, the ICC_SGI_* register provides affinity routing information,
> > and we are interested in exploring the possibility of utilizing this information to reduce iteration
> > times from a total of vcpu numbers to 16 (the length of the targetlist), or even 8 times.
> > 
> > This work is based on v5.4, and here is test data:

This is a 4 year old kernel. I'm afraid you'll have to provide
something that is relevant to a current (i.e. v6.5) kernel.

> > 4 cores with vcpu pinning:
> > |                 | ipi benchmark                            | vgic_v3_dispatch_sgi             |
> > |                 | original      | with patch    | improved | original | with patch | improved |
> > | core0 -> core1  | 292610285 ns  | 299856696 ns  | -2.5%    | 1471 ns  | 1508 ns    | -2.5%    |
> > | core0 -> core3  | 333815742 ns  | 327647989 ns  | +1.8%    | 1578 ns  | 1532 ns    | +2.9%    |
> > | core0 -> all    | 439754104 ns  | 433987192 ns  | +1.3%    | 2970 ns  | 2875 ns    | +3.2%    |
> > 
> > 32 cores with vcpu pinning:
> > |                 | ipi benchmark                            | vgic_v3_dispatch_sgi             |
> > |                 | original      | with patch    | improved | original | with patch | improved |
> > | core0 -> core1  | 269153219 ns  | 261636906 ns  | +2.8%    | 1743 ns  | 1706 ns    | +2.1%    |
> > | core0 -> core31 | 685199666 ns  | 355250664 ns  | +48.2%   | 4238 ns  | 1838 ns    | +56.6%   |
> > | core0 -> all    | 7281278980 ns | 3403240773 ns | +53.3%   | 30879 ns | 13843 ns   | +55.2%   |
> > 
> > Based on the test results, the performance of vm with less than 16 cores remains almost the same,
> > while significant improvement can be observed with more than 16
> > cores.

This triggers multiple questions:

- what is the test being used? on what hardware? how can I reproduce
  this data?

- which guest OS *currently* makes use of broadcast or 1:N SGIs?
  Linux doesn't, and overall SGI multicasting is pretty useless to an
  OS.

[...]

> >  /*
> > - * Compare a given affinity (level 1-3 and a level 0 mask, from the SGI
> > - * generation register ICC_SGI1R_EL1) with a given VCPU.
> > - * If the VCPU's MPIDR matches, return the level0 affinity, otherwise
> > - * return -1.
> > + * Get affinity routing index from ICC_SGI_* register
> > + * format:
> > + *     aff3       aff2       aff1            aff0
> > + * |- 8 bits -|- 8 bits -|- 8 bits -|- 4 bits or 8bits -|

OK, so you are implementing RSS support:

- Why isn't that mentioned anywhere in the commit log?

- Given that KVM actively limits the MPIDR to 4 bits at Aff0, how does
  it even work in the first place?

- How is that advertised to the guest?

- How can the guest enable RSS support?

This is not following the GICv3 architecture, and I'm sceptical that
it actually works as is (I strongly suspect that you have additional
patches...).

	M.

-- 
Without deviation from the norm, progress is not possible.
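
For readers following the RSS point, here is a minimal userspace sketch of how an ICC_SGI1R_EL1 value decodes into target Aff0 numbers. It is not KVM code and not taken from the patch under discussion. The field positions follow the GICv3 architecture as I understand it (TargetList in bits [15:0], Aff1 in [23:16], INTID in [27:24], Aff2 in [39:32], IRM in bit 40, RS in [47:44], Aff3 in [55:48]); the macro names and the helper sgi_target_aff0() are made up for illustration. With RS == 0 only Aff0 values 0-15 are reachable, which is why an 8-bit Aff0 only makes sense once the range selector is in play.

```c
#include <stdint.h>

/* Illustrative field accessors for ICC_SGI1R_EL1 (names are hypothetical). */
#define SGI_TARGET_LIST(v)  ((uint16_t)((v) & 0xffff))
#define SGI_AFF1(v)         ((unsigned)(((v) >> 16) & 0xff))
#define SGI_INTID(v)        ((unsigned)(((v) >> 24) & 0xf))
#define SGI_AFF2(v)         ((unsigned)(((v) >> 32) & 0xff))
#define SGI_IRM(v)          ((unsigned)(((v) >> 40) & 0x1))
#define SGI_RS(v)           ((unsigned)(((v) >> 44) & 0xf))
#define SGI_AFF3(v)         ((unsigned)(((v) >> 48) & 0xff))

/*
 * Hypothetical helper: store the Aff0 values targeted by @sgi1r in
 * @aff0_out (at most 16 entries) and return how many were stored.
 * Each set bit n of TargetList selects Aff0 = RS * 16 + n.
 */
static int sgi_target_aff0(uint64_t sgi1r, unsigned aff0_out[16])
{
	uint16_t list = SGI_TARGET_LIST(sgi1r);
	unsigned base = SGI_RS(sgi1r) * 16;
	int n = 0;

	for (int bit = 0; bit < 16; bit++) {
		if (list & (1u << bit))
			aff0_out[n++] = base + bit;
	}

	return n;
}
```

The point of the sketch is only that the target list bounds the work at 16 candidates per write, and that reaching Aff0 values above 15 depends on the RS field, which a guest can only use if RSS is advertised to it.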