Subject: Re: [PATCH RFC v3 3/6] sched/idle: Add a generic poll before enter real idle path
To: Thomas Gleixner, Peter Zijlstra
Cc: Quan Xu, kvm@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, LKML,
 virtualization@lists.linux-foundation.org, x86@kernel.org,
 xen-devel@lists.xenproject.org, Yang Zhang, Ingo Molnar,
 "H. Peter Anvin", Borislav Petkov, Kyle Huey, Len Brown,
 Andy Lutomirski, Tom Lendacky, Tobias Klauser, Daniel Lezcano
References: <1510567565-5118-1-git-send-email-quan.xu0@gmail.com>
 <1510567565-5118-4-git-send-email-quan.xu0@gmail.com>
 <20171115121152.gqug5wzerlo3eimd@hirez.programming.kicks-ass.net>
 <46086489-5a01-16e1-9314-70ae53c01952@gmail.com>
From: Quan Xu
Message-ID: <564b8a6e-8ddd-4e3d-c670-10f1697e6c06@gmail.com>
Date: Fri, 17 Nov 2017 19:23:43 +0800
X-Mailing-List: linux-kernel@vger.kernel.org

On 2017-11-16 17:53, Thomas Gleixner wrote:
> On Thu, 16 Nov 2017, Quan Xu wrote:
>> On 2017-11-16 06:03, Thomas Gleixner wrote:
>> --- a/drivers/cpuidle/cpuidle.c
>> +++ b/drivers/cpuidle/cpuidle.c
>> @@ -210,6 +210,13 @@ int cpuidle_enter_state(struct cpuidle_device *dev,
>> struct cpuidle_driver *drv,
>>                 target_state = &drv->states[index];
>>         }
>>
>> +#ifdef CONFIG_PARAVIRT
>> +       paravirt_idle_poll();
>> +
>> +       if (need_resched())
>> +               return -EBUSY;
>> +#endif
> That's just plain wrong. We don't want to see any of this PARAVIRT crap in
> anything outside the architecture/hypervisor interfacing code which really
> needs it.
>
> The problem can and must be solved at the generic level in the first place
> to gather the data which can be used to make such decisions.
>
> How that information is used might be either completely generic or requires
> system specific variants. But as long as we don't have any information at
> all we cannot discuss that.
>
> Please sit down and write up which data needs to be considered to make
> decisions about probabilistic polling.
> Then we need to compare and contrast that with the data which is
> necessary to make power/idle state decisions.
>
> I would be very surprised if this data would not overlap by at least 90%.
>

Peter, tglx,

Thanks for your comments. Rethinking this patch set:

1. Which data needs to be considered to make decisions about
   probabilistic polling

I really do need to write up which data needs to be considered to make
decisions about probabilistic polling. For the last several months I have
focused only on the data _from idle to reschedule_, in order to bypass the
idle loops. Unfortunately, this inevitably makes me touch
scheduler/idle/nohz code.

Following tglx's suggestion: the data which is necessary to make power/idle
state decisions is the last idle state's residency time. IIUC this data is
the duration from idle to wakeup, where the wakeup may be caused by a
reschedule IRQ or by some other IRQ.

I also tested this: the wakeups overlap with the reschedule IRQ by more
than 90% (traced via the need_resched status after cpuidle_idle_call) when
I run ctxsw/netperf for one minute.

Given that overlap, I think I can use the last idle state's residency time
as the input for decisions about probabilistic polling, as
@dev->last_residency does. That data is also much easier to get.

2. Do a HV specific idle driver (function)

So far, power management is not exposed to the guest. Idle is simple for a
KVM guest: it calls "sti" / "hlt" (cpuidle_idle_call() -->
default_idle_call()). Thanks to the Xen guys, who implemented the paravirt
framework, I can implement it as easily as the following:

    --- a/arch/x86/kernel/kvm.c
    +++ b/arch/x86/kernel/kvm.c
    @@ -465,6 +465,12 @@ static void __init kvm_apf_trap_init(void)
            update_intr_gate(X86_TRAP_PF, async_page_fault);
     }

    +static __cpuidle void kvm_safe_halt(void)
    +{
    +        /* 1. POLL, if need_resched() --> return */
    +
    +        asm volatile("sti; hlt": : :"memory"); /* 2. halt */
    +
    +        /* 3. get the last idle state's residency time */
    +
    +        /* 4. update poll duration based on last idle state's residency time */
    +}
    +
     void __init kvm_guest_init(void)
     {
            int i;
    @@ -490,6 +496,8 @@ void __init kvm_guest_init(void)
            if (kvmclock_vsyscall)
                    kvm_setup_vsyscall_timeinfo();

    +       pv_irq_ops.safe_halt = kvm_safe_halt;
    +
     #ifdef CONFIG_SMP

Then I have no need to introduce a new pvop, and I never have to modify
scheduler/idle/nohz code again. I can also narrow all of the code down to
arch/x86/kernel/kvm.c.

If this is the right direction, I will send a new patch set next week.

thanks,

Quan
Alibaba Cloud
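For illustration only, step 4 of kvm_safe_halt() above (update the poll
duration from the last idle state's residency time) could look roughly like
the user-space sketch below. The mail does not specify the policy, so
everything here is an assumption: struct poll_cfg, next_poll_ns() and the
grow/shrink rule are hypothetical names and choices, not part of the patch.
The assumed rule is: if the last halt was short enough that polling up to
max_ns would have caught the wakeup, grow the window; otherwise shrink it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tunables; none of these names come from the patch. */
struct poll_cfg {
	uint64_t min_ns;	/* window used when growing from zero */
	uint64_t max_ns;	/* upper bound on the poll window */
	unsigned int grow;	/* multiplier after a short halt */
	unsigned int shrink;	/* divisor after a long halt */
};

/*
 * Return the next poll window, given the current window and the
 * residency of the last halt (time from halt to wakeup).
 */
uint64_t next_poll_ns(const struct poll_cfg *cfg, uint64_t poll_ns,
		      uint64_t last_residency_ns)
{
	assert(cfg->grow > 1 && cfg->shrink > 1);
	assert(cfg->min_ns > 0 && cfg->min_ns <= cfg->max_ns);

	if (last_residency_ns <= cfg->max_ns) {
		/* Short halt: a poll of up to max_ns would have seen the wakeup. */
		if (poll_ns == 0)
			return cfg->min_ns;
		/* Clamp before multiplying so the product cannot overflow. */
		if (poll_ns > cfg->max_ns / cfg->grow)
			return cfg->max_ns;
		return poll_ns * cfg->grow;
	}

	/* Long halt: polling was wasted cycles, so back off (may reach 0). */
	return poll_ns / cfg->shrink;
}
```

With min_ns = 10000, max_ns = 200000 and grow = shrink = 2, a run of short
halts would take the window 0 -> 10000 -> 20000 -> ... up to 200000, and
each long halt would halve it again. Whether a multiplicative rule like this
suits the ctxsw/netperf workloads is an open question that needs measuring.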