Subject: Re: [PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops
From: Quan Xu
Date: Tue, 14 Nov 2017 18:23:30 +0800
To: Wanpeng Li
Cc: Juergen Gross, Quan Xu, kvm, linux-doc@vger.kernel.org,
    "open list:FILESYSTEMS (VFS and infrastructure)",
    "linux-kernel@vger.kernel.org", virtualization@lists.linux-foundation.org,
    the arch/x86 maintainers, xen-devel, Yang Zhang, Alok Kataria,
    Rusty Russell, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin"
References: <1510567565-5118-1-git-send-email-quan.xu0@gmail.com>
    <1510567565-5118-2-git-send-email-quan.xu0@gmail.com>
    <07fac696-e3d4-8f35-8f3d-764d7ab41204@suse.com>
    <902da704-1e4f-583b-91c3-1a62ccd6e73d@gmail.com>

On 2017/11/14 16:22, Wanpeng Li wrote:
> 2017-11-14 16:15 GMT+08:00 Quan Xu:
>>
>> On 2017/11/14 15:12, Wanpeng Li wrote:
>>> 2017-11-14 15:02 GMT+08:00 Quan Xu:
>>>>
>>>> On 2017/11/13 18:53, Juergen Gross wrote:
>>>>> On 13/11/17 11:06, Quan Xu wrote:
>>>>>> From: Quan Xu
>>>>>>
>>>>>> So far, pv_idle_ops.poll is the only op for pv_idle. .poll is called
>>>>>> in the idle path and polls for a while before we enter the real idle
>>>>>> state.
>>>>>>
>>>>>> In virtualization, the idle path includes several heavy operations,
>>>>>> including timer access (LAPIC timer or TSC deadline timer), which
>>>>>> hurt performance, especially for latency-intensive workloads such as
>>>>>> message-passing tasks. The cost is mainly from the vmexit, which is a
>>>>>> hardware context switch between the virtual machine and the
>>>>>> hypervisor. Our solution is to poll for a while and not enter the
>>>>>> real idle path if we get a schedule event during polling.
>>>>>>
>>>>>> Polling may waste CPU, so we adopt a smart polling mechanism to
>>>>>> reduce useless polling.
>>>>>>
>>>>>> Signed-off-by: Yang Zhang
>>>>>> Signed-off-by: Quan Xu
>>>>>> Cc: Juergen Gross
>>>>>> Cc: Alok Kataria
>>>>>> Cc: Rusty Russell
>>>>>> Cc: Thomas Gleixner
>>>>>> Cc: Ingo Molnar
>>>>>> Cc: "H. Peter Anvin"
>>>>>> Cc: x86@kernel.org
>>>>>> Cc: virtualization@lists.linux-foundation.org
>>>>>> Cc: linux-kernel@vger.kernel.org
>>>>>> Cc: xen-devel@lists.xenproject.org
>>>>> Hmm, is the idle entry path really so critical to performance that a new
>>>>> pvops function is necessary?
>>>> Juergen, here is the data we get when running the netperf benchmark:
>>>> 1. w/o patch and disable kvm dynamic poll (halt_poll_ns=0):
>>>>    29031.6 bit/s -- 76.1 %CPU
>>>>
>>>> 2. w/ patch and disable kvm dynamic poll (halt_poll_ns=0):
>>>>    35787.7 bit/s -- 129.4 %CPU
>>>>
>>>> 3. w/ kvm dynamic poll:
>>>>    35735.6 bit/s -- 200.0 %CPU
>>> Actually we can reduce the CPU utilization by sleeping for a period of
>>> time, as is already done in the poll logic of the IO subsystem; then
>>> we can improve the algorithm in kvm instead of introducing another
>>> duplicate one in the kvm guest.
>> We really appreciate upstream's kvm dynamic poll mechanism, which is
>> really helpful for a lot of scenarios.
>>
>> However, as the description said, in virtualization, the idle path
>> includes several heavy operations, including timer access (LAPIC timer
>> or TSC deadline timer), which hurt performance, especially for
>> latency-intensive workloads such as message-passing tasks. The cost is
>> mainly from the vmexit, which is a hardware context switch between the
>> virtual machine and the hypervisor.
>>
>> For upstream's kvm dynamic poll mechanism, even if you could provide a
>> better algorithm, how could you bypass timer access (LAPIC timer or TSC
>> deadline timer), or a hardware context switch between the virtual
>> machine and the hypervisor? I know there is a tradeoff.
>>
>> Furthermore, here is the data we get when running the contextswitch
>> benchmark to measure the latency (lower is better):
>>
>> 1. w/o patch and disable kvm dynamic poll (halt_poll_ns=0):
>>    3402.9 ns/ctxsw -- 199.8 %CPU
>>
>> 2. w/ patch and disable kvm dynamic poll:
>>    1163.5 ns/ctxsw -- 205.5 %CPU
>>
>> 3. w/ kvm dynamic poll:
>>    2280.6 ns/ctxsw -- 199.5 %CPU
>>
>> So these two solutions are quite similar, but not duplicates.
>>
>> That's also why we add a generic idle poll before entering the real
>> idle path. When a reschedule event is pending, we can bypass the real
>> idle path.
>>
> There is similar logic in the idle governor/driver, so how does this
> patchset influence the decision in the idle governor/driver when
> running on bare metal (power management is not exposed to the guest, so
> we will not enter the idle driver in the guest)?
>
This is expected to take effect only when running as a virtual machine with
the proper CONFIG_* enabled. It cannot work on bare metal even with the
proper CONFIG_* enabled.

Quan
Alibaba Cloud
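
For readers who want a concrete picture of the poll-before-idle idea discussed in this thread, the following is a minimal user-space sketch in C. It is not the code from the patch: `struct poll_state`, `poll_before_idle()`, the iteration-count budget, and the grow/shrink rule for the window are all assumptions made for illustration, since this message does not show how the smart polling mechanism adjusts itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical polling state; field names are illustrative, not from the patch. */
struct poll_state {
	uint64_t window;     /* current poll budget, counted in loop iterations */
	uint64_t window_min; /* lower bound for the budget (must be > 0) */
	uint64_t window_max; /* upper bound for the budget */
};

/*
 * Poll for up to st->window iterations before "going idle".
 *
 * event_at simulates the iteration at which a reschedule event becomes
 * pending (UINT64_MAX means it never arrives). Returns true if the event
 * was seen during polling, meaning the caller could skip the real idle
 * path and the vmexit that comes with it.
 *
 * Assumed heuristic: double the window after a successful poll, halve it
 * after a useless one, clamped to [window_min, window_max].
 */
static bool poll_before_idle(struct poll_state *st, uint64_t event_at)
{
	bool hit = false;

	assert(st != NULL);
	assert(st->window_min > 0 && st->window_min <= st->window_max);

	for (uint64_t i = 0; i < st->window; i++) {
		if (i >= event_at) { /* stand-in for checking a resched flag */
			hit = true;
			break;
		}
	}

	if (hit) {
		/* Polling paid off: grow the window, avoiding overflow. */
		if (st->window > st->window_max / 2)
			st->window = st->window_max;
		else
			st->window *= 2;
	} else {
		/* Polling was wasted CPU: shrink the window. */
		st->window /= 2;
	}

	/* Clamp so the window never leaves [window_min, window_max]. */
	if (st->window < st->window_min)
		st->window = st->window_min;
	if (st->window > st->window_max)
		st->window = st->window_max;

	return hit;
}
```

With a window of 100 and a maximum of 400, an event arriving at iteration 5 is caught during polling and the window doubles to 200; a following poll that sees no event halves it back to 100. A real implementation would presumably measure elapsed time rather than count iterations and would test the scheduler's reschedule flag instead of a simulated arrival point.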