Subject: Re: [PATCH] IPI performance benchmark
To: Yury Norov
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, Andrew Morton, Ashish Kalra, Christoffer Dall, Geert Uytterhoeven, Linu Cherian, Sunil Goutham
From: Christian Borntraeger
Date: Wed, 13 Dec 2017 12:31:56 +0100
In-Reply-To: <20171213112355.s3ubggurwx4v3r53@yury-thinkpad>
X-Mailing-List: linux-kernel@vger.kernel.org

On 12/13/2017 12:23 PM, Yury Norov wrote:
> On Mon, Dec 11, 2017 at 05:30:25PM +0100, Christian
> Borntraeger wrote:
>>
>>
>> On 12/11/2017 03:55 PM, Yury Norov wrote:
>>> On Mon, Dec 11, 2017 at 03:35:02PM +0100, Christian Borntraeger wrote:
>>>>
>>>>
>>>> On 12/11/2017 03:16 PM, Yury Norov wrote:
>>>>> This benchmark sends many IPIs in different modes and measures
>>>>> time for IPI delivery (first column), and total time, i.e. including
>>>>> time to acknowledge the receipt by the sender (second column).
>>>>>
>>>>> The scenarios are:
>>>>> Dry-run:        do everything except actually sending IPI. Useful
>>>>>                 to estimate system overhead.
>>>>> Self-IPI:       Send IPI to self CPU.
>>>>> Normal IPI:     Send IPI to some other CPU.
>>>>> Broadcast IPI:  Send broadcast IPI to all online CPUs.
>>>>>
>>>>> For virtualized guests, sending and receiving IPIs causes guest exit.
>>>>> I used this test to measure performance impact on KVM subsystem of
>>>>> Christoffer Dall's series "Optimize KVM/ARM for VHE systems".
>>>>>
>>>>> https://www.spinics.net/lists/kvm/msg156755.html
>>>>>
>>>>> Test machine is ThunderX2, 112 online CPUs. Below are the results,
>>>>> normalized to host dry-run time. Smaller is better.
>>>>>
>>>>> Host, v4.14:
>>>>> Dry-run:           0      1
>>>>> Self-IPI:          9     18
>>>>> Normal IPI:       81    110
>>>>> Broadcast IPI:     0   2106
>>>>>
>>>>> Guest, v4.14:
>>>>> Dry-run:           0      1
>>>>> Self-IPI:         10     18
>>>>> Normal IPI:      305    525
>>>>> Broadcast IPI:     0   9729
>>>>>
>>>>> Guest, v4.14 + VHE:
>>>>> Dry-run:           0      1
>>>>> Self-IPI:          9     18
>>>>> Normal IPI:      176    343
>>>>> Broadcast IPI:     0   9885
>> [...]
>>>>> +static int __init init_bench_ipi(void)
>>>>> +{
>>>>> +	ktime_t ipi, total;
>>>>> +	int ret;
>>>>> +
>>>>> +	ret = bench_ipi(NTIMES, DRY_RUN, &ipi, &total);
>>>>> +	if (ret)
>>>>> +		pr_err("Dry-run FAILED: %d\n", ret);
>>>>> +	else
>>>>> +		pr_err("Dry-run: %18llu, %18llu ns\n", ipi, total);
>>>>
>>>> You do not use NTIMES here to calculate the average value. Is that intended?
>>>
>>> I think it's clearer to represent all results as multiples of the dry-run
>>> time, as I did in the patch description.
>>> So on the kernel side I expose raw data
>>> and calculate final values after finishing tests.
>>
>> I think it is highly confusing that the output from the patch description does not
>> match the output from the real module. So can you make that match at least?
>
> I think so. That's why I noted that the results are normalized to host dry-run
> time; moreover, they are small and easier for a human to read.
>
> I was advised not to publish raw data, as you'll understand. If this is
> a blocker, I can post results from a QEMU-hosted kernel.

You could just post some example data from any random x86 laptop. I think it would just
be good to have the patch description output match the real output.
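
To make the mismatch concrete, here is a rough userspace sketch (not taken from the patch) of the two post-processing steps discussed above: averaging a raw total over the iteration count, and expressing a raw time in units of the dry-run total. The helper names `per_iteration`, `normalize` and `print_row` are hypothetical, and truncating integer division is an assumption, since the thread does not say how the normalized figures were rounded.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Average time per iteration. ntimes plays the role of NTIMES in the
 * patch; returns 0 if ntimes is 0 to avoid dividing by zero.
 */
uint64_t per_iteration(uint64_t raw_ns, uint64_t ntimes)
{
	return ntimes ? raw_ns / ntimes : 0;
}

/*
 * Express raw_ns as a multiple of the dry-run total time.
 * Assumption: truncating division; the real rounding is not stated.
 */
uint64_t normalize(uint64_t raw_ns, uint64_t dry_run_total_ns)
{
	return dry_run_total_ns ? raw_ns / dry_run_total_ns : 0;
}

/* Print one row showing the raw values next to the normalized ones. */
void print_row(const char *name, uint64_t ipi_ns, uint64_t total_ns,
	       uint64_t dry_run_total_ns)
{
	printf("%-14s %18" PRIu64 ", %18" PRIu64 " ns (x dry-run: %" PRIu64
	       ", %" PRIu64 ")\n",
	       name, ipi_ns, total_ns,
	       normalize(ipi_ns, dry_run_total_ns),
	       normalize(total_ns, dry_run_total_ns));
}
```

Printing both forms on one row, for example with `print_row("Normal IPI:", ipi_ns, total_ns, dry_total_ns)` fed with whatever raw values the module reports, would let the patch description and the module output show the same columns.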