From: Abhishek Goel
To: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-pm@vger.kernel.org
Cc: rjw@rjwysocki.net, daniel.lezcano@linaro.org, mpe@ellerman.id.au, ego@linux.vnet.ibm.com, dja@axtens.net, Abhishek Goel
Subject: [PATCH 0/1] Forced-wakeup for stop lite states on Powernv
Date: Mon, 22 Apr 2019 01:32:30 -0500
Message-Id: <20190422063231.51043-1-huntbag@linux.vnet.ibm.com>

Currently, the cpuidle governors determine what idle state an idle CPU
should enter
based on heuristics that depend on the idle history on that CPU. Given
that no predictive heuristic is perfect, there are cases where the
governor predicts a shallow idle state, hoping that the CPU will be busy
soon. However, if no new workload is scheduled on that CPU in the near
future, the CPU will end up staying in the shallow state.

Motivation
----------
In case of POWER, this is problematic when the predicted state in the
aforementioned scenario is a lite stop state, as such lite states
inhibit SMT folding, thereby depriving the other threads in the core of
the core resources.

So we do not want to get stuck in such states for a long duration. To
address this, the cpuidle core can queue a timer that corresponds to the
residency value of the next available state. This timer will forcefully
wake up the CPU. A few such iterations will essentially train the
governor to select a deeper state for that CPU, as the timer here
corresponds to the next available cpuidle state residency. The CPU will
be kicked out of the lite state and end up in a non-lite state.

Experiment
----------
I performed experiments for three scenarios to collect some data.

case 1 :
Without this patch and without tick retained, i.e. in an upstream
kernel, it would take more than a second to get out of stop0_lite.

case 2 :
With tick retained in an upstream kernel -

Generally, we have a sched tick at 4ms (CONFIG_HZ = 250). Ideally I
expected it to take 8 sched ticks to get out of stop0_lite.
Experimentally, the observation was

=========================================================
sample          min            max           99percentile
20              4ms            12ms          4ms
=========================================================

It would take at least one sched tick to get out of stop0_lite.

case 3 :
With this patch (not stopping tick, but explicitly queuing a timer)

============================================================
sample          min            max           99percentile
============================================================
20              144us          192us         144us
============================================================

In this patch, we queue a timer just before entering a stop0_lite
state. The timer fires at (residency of next available state + exit
latency of next available state * 2).
Let's say the next available state is stop0, which has a residency of
20us: the CPU should get out in as low as (20+2*2)*8 [based on the
formula (residency + 2xlatency) * history length] microseconds = 192us.
Ideally we would expect 8 iterations; it was observed to get out in 6-7
iterations. Even if, say, stop2 is the next available state (stop0 and
stop1 both being unavailable), it would take (100+2*10)*8 = 960us to get
into stop2.

So we are able to get out of stop0_lite generally in 150us (with this
patch) as compared to 4ms (with tick retained). As stated earlier, we do
not want to get stuck in stop0_lite as it inhibits SMT folding for other
sibling threads, depriving them of core resources.

The current patch uses forced wakeup only for stop0_lite, as it gives a
performance benefit (the primary reason) along with lower power
consumption. We may extend this model to other states in the future.

Abhishek Goel (1):
  cpuidle-powernv : forced wakeup for stop lite states

 arch/powerpc/include/asm/opal-api.h |  1 +
 drivers/cpuidle/cpuidle-powernv.c   | 71 ++++++++++++++++++++++++++++-
 2 files changed, 71 insertions(+), 1 deletion(-)

-- 
2.17.1
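
For readers who want to re-check the arithmetic in the Experiment
section, here is a minimal Python sketch of the timeout formula quoted
above. It is not code from the patch: the function names are
hypothetical, the exit latencies (2us for stop0, 10us for stop2) are
read off the worked figures in this letter, and the history length of 8
is the value the letter uses. It also assumes the governor moves to the
deeper state after exactly the given number of forced wakeups.

```python
# Illustrative arithmetic only; not taken from cpuidle-powernv.c.
# All names below are hypothetical.

HISTORY_LENGTH = 8  # number of iterations assumed in the cover letter


def forced_wakeup_timeout_us(residency_us, exit_latency_us):
    """Timer value for one forced wakeup: residency + 2 * exit latency."""
    return residency_us + 2 * exit_latency_us


def time_to_leave_lite_us(residency_us, exit_latency_us,
                          iterations=HISTORY_LENGTH):
    """Total time in stop0_lite if `iterations` forced wakeups are needed."""
    return forced_wakeup_timeout_us(residency_us, exit_latency_us) * iterations


def tick_period_ms(hz):
    """Scheduler tick period in milliseconds for a given CONFIG_HZ."""
    return 1000 / hz


if __name__ == "__main__":
    print(time_to_leave_lite_us(20, 2))      # stop0 next, 8 iterations: 192
    print(time_to_leave_lite_us(20, 2, 6))   # stop0 next, 6 iterations: 144
    print(time_to_leave_lite_us(100, 10))    # stop2 next, 8 iterations: 960
    print(tick_period_ms(250))               # CONFIG_HZ=250: 4.0
```

With 6 and 8 iterations the sketch gives 144us and 192us, which matches
the min and max columns measured with the patch applied.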