Date: Thu, 6 Jun 2024 12:54:31 +0100
Subject: Re: [PATCH 0/6] cpuidle: teo: fixes and improvements
To: linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org, rafael@kernel.org
Cc: vincent.guittot@linaro.org, qyousef@layalina.io, peterz@infradead.org,
 daniel.lezcano@linaro.org, anna-maria@linutronix.de,
 kajetan.puchalski@arm.com, lukasz.luba@arm.com, dietmar.eggemann@arm.com
References: <20240606090050.327614-1-christian.loehle@arm.com>
From: Christian Loehle
In-Reply-To:
 <20240606090050.327614-1-christian.loehle@arm.com>

On 6/6/24 10:00, Christian Loehle wrote:
> Hi all,
> so my investigation into teo led to the following fixes and
> improvements. Logically they are mostly independent, that's why this
> cover letter is quite short, details are in the patches.
>
> 1/6:
> As discussed, the utilization threshold is too high, while
> there are benefits in certain workloads, there are quite a few
> regressions, too.
> 2/6:
> Especially with the new util threshold, stopping tick makes little
> sense when utilized is detected, so don't.
> 3/6:
> Particularly with WFI, even if it's the only state, stopping the tick
> has benefits, so enable that in the early bail out.
> 4/6:
> Stopping the tick with 0 cost (if the idle state dictates it) is too
> aggressive IMO, so add 1ms constant cost.
> XXX: This has the issue of now being counted as idle_miss, so we could
> consider adding this to the states, too, but the simple implementation
> of this would have the downside that the cost is added to deeper states
> even if the tick is already off.
> 5/6:
> Remove the 'recent' intercept logic, see my findings in:
> https://lore.kernel.org/lkml/0ce2d536-1125-4df8-9a5b-0d5e389cd8af@arm.com/
> I haven't found a way to salvage this properly, so I removed it.
> The regular intercept seems to decay fast enough to not need this, but
> we could change it if that turns out to be the case.
> 6/6:
> The rest of the intercept logic had issues, too.
> See the commit.
>
> TODO: add some measurements of common workloads and some simple sanity
> tests (like Vincent described in low utilization workloads if the
> state selection looks reasonable).
> I have some, but more (and more standardized) would be beneficial.
>
> Happy for anyone to take a look and test as well.
>
> Some numbers for context:
> Maybe some numbers for context, I'll probably add them to the cover letter.
>
> Comparing:
> - IO workload (intercept heavy).
> - Timer workload very low utilization (check for deepest state)
> - hackbench (high utilization)
> all on RK3399 with CONFIG_HZ=100.
> target_residencies: 1, 900, 2000
>
> 1. IO workload, 5 runs, results sorted, in read IOPS.
> fio --minimal --time_based --name=fiotest --filename=/dev/nvme0n1 --runtime=30 --rw=randread --bs=4k --ioengine=psync --iodepth=1 --direct=1 | cut -d \; -f 8;
>
> teo fixed:
> /dev/nvme0n1
> [4597, 4673, 4727, 4741, 4756]
> /dev/mmcblk2
> [5753, 5832, 5837, 5911, 5949]
> /dev/mmcblk1
> [2059, 2062, 2070, 2071, 2080]
>
> teo mainline:
> /dev/nvme0n1
> [3793, 3825, 3846, 3865, 3964]
> /dev/mmcblk2
> [3831, 4110, 4154, 4203, 4228]
> /dev/mmcblk1
> [1559, 1564, 1596, 1611, 1618]
>
> menu:
> /dev/nvme0n1
> [2571, 2630, 2804, 2813, 2917]
> /dev/mmcblk2
> [4181, 4260, 5062, 5260, 5329]
> /dev/mmcblk1
> [1567, 1581, 1585, 1603, 1769]
>
> 2. Timer workload (through IO for my convenience ;) )
> Results in read IOPS, fio same as above.
> echo "0 2097152 zero" | dmsetup create dm-zeros
> echo "0 2097152 delay /dev/mapper/dm-zeros 0 50" | dmsetup create dm-slow
> (Each IO is delayed by timer of 50ms, should be mostly in state2)
>
> teo fixed:
> 3269 cpu_idle total
> 48 cpu_idle_miss
> 30 cpu_idle_miss above
> 18 cpu_idle_miss below
>
> teo mainline:
> 3221 cpu_idle total
> 1269 cpu_idle_miss
> 22 cpu_idle_miss above
> 1247 cpu_idle_miss below
>
> menu:
> 3433 cpu_idle total
> 114 cpu_idle_miss
> 61 cpu_idle_miss above
> 53 cpu_idle_miss below
>
> Residencies:

Hmm, maybe actually including them would've been helpful too:
(Over 5s workload, only showing LITTLE cluster)

teo fixed:
idle_state
 2.0    4.813378
-1.0    0.210820
 1.0    0.202778
 0.0    0.062426

teo mainline:
idle_state
 1.0    4.895766
-1.0    0.098063
 0.0    0.253069

menu:
idle_state
 2.0    4.528356
-1.0    0.241486
 1.0    0.345829
 0.0    0.202505

>
> tldr: overall teo fixed spends more time in state2 while having
> fewer idle_miss than menu.
> teo mainline was just way too aggressive at selecting shallow states.
>
> 3. Hackbench, 5 runs
> for i in $(seq 0 4); do hackbench -l 100 -g 100 ; sleep 1; done
>
> teo fixed:
> Time: 4.807
> Time: 4.856
> Time: 5.072
> Time: 4.934
> Time: 4.962
>
> teo mainline:
> Time: 4.945
> Time: 5.021
> Time: 4.927
> Time: 4.923
> Time: 5.137
>
> menu:
> Time: 4.991
> Time: 4.884
> Time: 4.880
> Time: 4.946
> Time: 4.980
>
> tldr: all comparable, teo mainline slightly worse
>
> Kind Regards,
> Christian
>
> Christian Loehle (6):
>   cpuidle: teo: Increase util-threshold
>   cpuidle: teo: Don't stop tick on utilized
>   cpuidle: teo: Don't always stop tick on one state
>   cpuidle: teo: Increase minimum time to stop tick
>   cpuidle: teo: Remove recent intercepts metric
>   cpuidle: teo: Don't count non-existent intercepts
>
>  drivers/cpuidle/governors/teo.c | 121 +++++++++++++-------------------
>  1 file changed, 48 insertions(+), 73 deletions(-)
>
> --
> 2.34.1
>
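
To condense the numbers quoted above, here is a minimal Python sketch (not part of the series) that computes the per-device median read IOPS and the idle-miss rate for each governor. All figures are copied from the results above. The helper names (median_iops, miss_rate, gain_over) are made up for illustration, and treating cpu_idle_miss / cpu_idle total as a "miss rate" is an assumption about how to read the counters, not something the patches define.

```python
from statistics import median

# Read IOPS per governor and device, copied from the fio results above
# (5 sorted runs each, RK3399, CONFIG_HZ=100).
IOPS = {
    "teo fixed": {
        "/dev/nvme0n1": [4597, 4673, 4727, 4741, 4756],
        "/dev/mmcblk2": [5753, 5832, 5837, 5911, 5949],
        "/dev/mmcblk1": [2059, 2062, 2070, 2071, 2080],
    },
    "teo mainline": {
        "/dev/nvme0n1": [3793, 3825, 3846, 3865, 3964],
        "/dev/mmcblk2": [3831, 4110, 4154, 4203, 4228],
        "/dev/mmcblk1": [1559, 1564, 1596, 1611, 1618],
    },
    "menu": {
        "/dev/nvme0n1": [2571, 2630, 2804, 2813, 2917],
        "/dev/mmcblk2": [4181, 4260, 5062, 5260, 5329],
        "/dev/mmcblk1": [1567, 1581, 1585, 1603, 1769],
    },
}

# (cpu_idle total, cpu_idle_miss) from the 50ms timer workload above.
IDLE = {
    "teo fixed": (3269, 48),
    "teo mainline": (3221, 1269),
    "menu": (3433, 114),
}


def median_iops(governor):
    """Median read IOPS per device for one governor."""
    return {dev: median(runs) for dev, runs in IOPS[governor].items()}


def miss_rate(governor):
    """Fraction of idle entries that were counted as cpu_idle_miss."""
    total, miss = IDLE[governor]
    return miss / total


def gain_over(governor, baseline, device):
    """Ratio of median IOPS of `governor` to `baseline` on one device."""
    return median_iops(governor)[device] / median_iops(baseline)[device]


if __name__ == "__main__":
    for gov in IOPS:
        print(gov, median_iops(gov), f"miss rate {miss_rate(gov):.1%}")
```

With the numbers above this gives miss rates of roughly 1.5% (teo fixed), 39.4% (teo mainline) and 3.3% (menu), and a median nvme0n1 gain of about 1.23x for teo fixed over teo mainline. Medians are used rather than means only because each set has just 5 runs.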