Subject: Re: [PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops
To: Wanpeng Li
Cc: Juergen Gross, Quan Xu, kvm, linux-doc@vger.kernel.org,
 "open list:FILESYSTEMS (VFS and infrastructure)",
 "linux-kernel@vger.kernel.org",
 virtualization@lists.linux-foundation.org,
 the arch/x86 maintainers, xen-devel, Yang Zhang, Alok Kataria,
 Rusty Russell, Thomas Gleixner, Ingo Molnar, "H.
 Peter Anvin"
References: <1510567565-5118-1-git-send-email-quan.xu0@gmail.com>
 <1510567565-5118-2-git-send-email-quan.xu0@gmail.com>
 <07fac696-e3d4-8f35-8f3d-764d7ab41204@suse.com>
 <902da704-1e4f-583b-91c3-1a62ccd6e73d@gmail.com>
From: Quan Xu
Date: Tue, 14 Nov 2017 16:15:39 +0800
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Content-Language: en-US
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On 2017/11/14 15:12, Wanpeng Li wrote:
> 2017-11-14 15:02 GMT+08:00 Quan Xu:
>>
>> On 2017/11/13 18:53, Juergen Gross wrote:
>>> On 13/11/17 11:06, Quan Xu wrote:
>>>> From: Quan Xu
>>>>
>>>> So far, pv_idle_ops.poll is the only ops for pv_idle. .poll is called
>>>> in idle path which will poll for a while before we enter the real idle
>>>> state.
>>>>
>>>> In virtualization, idle path includes several heavy operations
>>>> includes timer access(LAPIC timer or TSC deadline timer) which will
>>>> hurt performance especially for latency intensive workload like message
>>>> passing task. The cost is mainly from the vmexit which is a hardware
>>>> context switch between virtual machine and hypervisor. Our solution is
>>>> to poll for a while and do not enter real idle path if we can get the
>>>> schedule event during polling.
>>>>
>>>> Poll may cause the CPU waste so we adopt a smart polling mechanism to
>>>> reduce the useless poll.
>>>>
>>>> Signed-off-by: Yang Zhang
>>>> Signed-off-by: Quan Xu
>>>> Cc: Juergen Gross
>>>> Cc: Alok Kataria
>>>> Cc: Rusty Russell
>>>> Cc: Thomas Gleixner
>>>> Cc: Ingo Molnar
>>>> Cc: "H. Peter Anvin"
>>>> Cc: x86@kernel.org
>>>> Cc: virtualization@lists.linux-foundation.org
>>>> Cc: linux-kernel@vger.kernel.org
>>>> Cc: xen-devel@lists.xenproject.org
>>> Hmm, is the idle entry path really so critical to performance that a new
>>> pvops function is necessary?
>> Juergen, Here is the data we get when running benchmark netperf:
>> 1. w/o patch and disable kvm dynamic poll (halt_poll_ns=0):
>> 29031.6 bit/s -- 76.1 %CPU
>>
>> 2. w/ patch and disable kvm dynamic poll (halt_poll_ns=0):
>> 35787.7 bit/s -- 129.4 %CPU
>>
>> 3. w/ kvm dynamic poll:
>> 35735.6 bit/s -- 200.0 %CPU
> Actually we can reduce the CPU utilization by sleeping a period of
> time as what has already been done in the poll logic of IO subsystem,
> then we can improve the algorithm in kvm instead of introducing another
> duplicate one in the kvm guest.

We really appreciate upstream's kvm dynamic poll mechanism, which is
helpful in a lot of scenarios.

However, as the description says, in virtualization the idle path includes
several heavy operations, including timer access (LAPIC timer or TSC
deadline timer), which hurt performance, especially for latency-sensitive
workloads such as message-passing tasks. The cost comes mainly from the
vmexit, which is a hardware context switch between the virtual machine and
the hypervisor.

With upstream's kvm dynamic poll mechanism, even if you could provide a
better algorithm, how could you bypass the timer access (LAPIC timer or TSC
deadline timer) or the hardware context switch between the virtual machine
and the hypervisor? I know there is a tradeoff.

Furthermore, here is the data we get when running the contextswitch
benchmark to measure latency (lower is better):

  1. w/o patch and disable kvm dynamic poll (halt_poll_ns=0):
     3402.9 ns/ctxsw -- 199.8 %CPU

  2. w/ patch and disable kvm dynamic poll:
     1163.5 ns/ctxsw -- 205.5 %CPU

  3. w/ kvm dynamic poll:
     2280.6 ns/ctxsw -- 199.5 %CPU

So these two solutions are quite similar, but not duplicates.
That is also why we add a generic idle poll before entering the real idle
path: when a reschedule event is pending, we can bypass the real idle path.

Quan
Alibaba Cloud

> Regards,
> Wanpeng Li
>
>> 4. w/patch and w/ kvm dynamic poll:
>> 42225.3 bit/s -- 198.7 %CPU
>>
>> 5. idle=poll
>> 37081.7 bit/s -- 998.1 %CPU
>>
>>
>>
>> w/ this patch, we will improve performance by 23%.. even we could improve
>> performance by 45.4%, if we use w/patch and w/ kvm dynamic poll. also the
>> cost of CPU is much lower than 'idle=poll' case..
>>
>>> Wouldn't a function pointer, maybe guarded
>>> by a static key, be enough? A further advantage would be that this would
>>> work on other architectures, too.
>>
>> I assume this feature will be ported to other archs.. a new pvops makes code
>> clean and easy to maintain. also I tried to add it into existed pvops, but it
>> doesn't match.
>>
>>
>>
>> Quan
>> Alibaba Cloud
>>>
>>> Juergen
>>>
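
As a sanity check on the percentages quoted above, the gains can be
recomputed from the netperf figures in this thread. This is only a minimal
sketch: it assumes both gains are measured against case 1 (w/o patch,
halt_poll_ns=0), and the helper name relative_gain_percent is illustrative,
not part of the patch.

```python
# netperf throughput figures quoted in this thread (bit/s).
BASELINE = 29031.6            # case 1: w/o patch, halt_poll_ns=0
PATCH_ONLY = 35787.7          # case 2: w/ patch, halt_poll_ns=0
PATCH_AND_KVM_POLL = 42225.3  # case 4: w/ patch and w/ kvm dynamic poll


def relative_gain_percent(new: float, base: float) -> float:
    """Hypothetical helper: percentage gain of `new` over `base`."""
    return (new - base) / base * 100.0


# Assumption: both gains are taken relative to case 1.
patch_gain = relative_gain_percent(PATCH_ONLY, BASELINE)
combined_gain = relative_gain_percent(PATCH_AND_KVM_POLL, BASELINE)

print(f"patch only:            {patch_gain:.1f}%")
print(f"patch + kvm dyn. poll: {combined_gain:.1f}%")
```

Under that assumption the results come out at roughly 23.3% and 45.4%,
which is consistent with the "23%" and "45.4%" stated in the thread.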