From: lushenming
To: Marc Zyngier
CC: Thomas Gleixner, Jason Cooper, "linux-kernel@vger.kernel.org", "Wanghaibin (D)", yuzenghui
Subject: RE: [PATCH] irqchip/gic-v4.1: Optimize the delay time of the poll on the GICR_VPENDBASER.Dirty bit
Date: Fri, 18 Sep 2020 12:18:03 +0000
Message-ID: <343E0E168479F04FACCB176989D12DE7EE333B@dggemi522-mbs.china.huawei.com>
References: <343E0E168479F04FACCB176989D12DE7EE1D2D@dggemi522-mbs.china.huawei.com> <343E0E168479F04FACCB176989D12DE7EE3206@dggemi522-mbs.china.huawei.com>
 <8c9f4731295af025302e084ba546b74b@kernel.org>
In-Reply-To: <8c9f4731295af025302e084ba546b74b@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi, Marc,

I measured the time from vcpu_load() (inclusive) to __guest_enter() on
Kunpeng 920. On average, it takes 2.55 microseconds (excluding the first
run, and with the VPT empty). So waiting for 10 microseconds in vcpu
scheduling really hurts performance.

And I agree that delaying the execution of its_wait_vpt_parse_complete()
might be a viable solution.

-----Original Message-----
From: Marc Zyngier [mailto:maz@kernel.org]
Sent: 2020-09-16 16:40
To: lushenming
Cc: Thomas Gleixner; Jason Cooper; linux-kernel@vger.kernel.org; Wanghaibin (D); yuzenghui
Subject: Re: [PATCH] irqchip/gic-v4.1: Optimize the delay time of the poll on the GICR_VPENDBASER.Dirty bit

On 2020-09-16 08:04, lushenming wrote:
> Hi,
>
> Our team just discussed this issue again and consulted our GIC
> hardware design team. They think the RD can afford busy waiting. So we
> still think maybe 0 is better, at least for our hardware.
>
> In addition, if not 0, as I said before, in our measurement, it takes
> only hundreds of nanoseconds, or 1~2 microseconds, to finish parsing
> the VPT in most cases. So maybe 1 microsecond, or smaller, is more
> appropriate.
> Anyway, 10 microseconds is too much.
>
> But it has to be said that it does depend on the hardware
> implementation.

Exactly. And given that the only publicly available implementation is a
software model, I am reluctant to change "performance" related things
based on benchmarks that can't be verified, for what appears to me to be
a micro-optimization.
> Besides, I'm not sure where the start and end points are of the total
> scheduling latency of a vcpu that you mentioned, which includes many
> events. Is the parse time of the VPT not clear enough?

Measure the time it takes from kvm_vcpu_load() to the point where the
vcpu enters the guest. How much, in proportion, do these 1/2/10
microseconds represent?

Also, a better(?) course of action would maybe be to consider whether we
should split the its_vpe_schedule() call into two distinct operations:
one that programs the VPE to be resident, and another that polls the
Dirty bit *much later* on the entry path, giving the GIC a chance to
work in parallel with the CPU on the entry path.

If your HW is as quick as you say it is, it would pretty much guarantee
a clear read of GICR_VPENDBASER without waiting.

        M.
-- 
Jazz is not dead. It just smells funny...
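
[Editor's note] The split suggested above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not the
kernel implementation: struct fake_rd, fake_rd_tick(), vpe_make_resident(),
vpe_wait_parse_complete(), schedule_combined() and schedule_split() are
hypothetical names, time is counted in abstract "ticks", and the bit
positions are illustrative. It compares polling the Dirty bit right after
making the VPE resident with polling it after the rest of the entry path
has run.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit positions; check the GIC architecture spec before reuse. */
#define VPENDBASER_VALID (1ULL << 63)
#define VPENDBASER_DIRTY (1ULL << 60)

/* Hypothetical software model of one redistributor (RD). */
struct fake_rd {
    uint64_t vpendbaser;       /* modelled GICR_VPENDBASER value */
    unsigned parse_ticks_left; /* simulated time until the VPT parse ends */
};

/* Advance simulated time; the RD parses the VPT in the background. */
static void fake_rd_tick(struct fake_rd *rd, unsigned ticks)
{
    if (ticks >= rd->parse_ticks_left) {
        rd->parse_ticks_left = 0;
        rd->vpendbaser &= ~VPENDBASER_DIRTY;
    } else {
        rd->parse_ticks_left -= ticks;
    }
}

/* Phase 1: make the VPE resident and return without polling. */
static void vpe_make_resident(struct fake_rd *rd, unsigned parse_ticks)
{
    rd->vpendbaser = VPENDBASER_VALID;
    rd->parse_ticks_left = parse_ticks;
    if (parse_ticks > 0)
        rd->vpendbaser |= VPENDBASER_DIRTY;
}

/*
 * Phase 2: poll Dirty until it clears. Each poll that still sees Dirty
 * costs one tick of delay. Returns the number of ticks spent waiting.
 */
static unsigned vpe_wait_parse_complete(struct fake_rd *rd)
{
    unsigned waited = 0;

    assert((rd->vpendbaser & VPENDBASER_VALID) != 0);
    while ((rd->vpendbaser & VPENDBASER_DIRTY) != 0) {
        fake_rd_tick(rd, 1);
        waited++;
    }
    return waited;
}

/* Combined behaviour: poll immediately, then run the rest of the entry path. */
static unsigned schedule_combined(struct fake_rd *rd, unsigned parse_ticks,
                                  unsigned entry_ticks)
{
    unsigned waited;

    vpe_make_resident(rd, parse_ticks);
    waited = vpe_wait_parse_complete(rd);
    fake_rd_tick(rd, entry_ticks); /* remaining entry-path work */
    return waited;
}

/* Split behaviour: the entry path overlaps with the parse; poll at the end. */
static unsigned schedule_split(struct fake_rd *rd, unsigned parse_ticks,
                               unsigned entry_ticks)
{
    vpe_make_resident(rd, parse_ticks);
    fake_rd_tick(rd, entry_ticks); /* CPU works while the RD parses */
    return vpe_wait_parse_complete(rd);
}
```

In this model, whenever the parse takes no longer than the remaining
entry-path work, the split variant reads a clear Dirty bit on its first
poll; otherwise it waits only for the part of the parse that did not
overlap with the entry path.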