Message-ID: <1891aa6c-037f-46a1-9584-17aaa63e4e74@arm.com>
Date: Fri, 13 Oct 2023 12:35:42 +0100
Subject: Re: [PATCH v8 00/25] timer: Move from a push remote at enqueue to a pull at expiry model
To: Anna-Maria Behnsen
Cc: Peter Zijlstra, linux-kernel@vger.kernel.org, John Stultz,
 Thomas Gleixner, Eric Dumazet, "Rafael J . Wysocki", Arjan van de Ven,
 "Paul E . McKenney", Frederic Weisbecker, Rik van Riel, Steven Rostedt,
 Sebastian Siewior, Giovanni Gherdovich, "Gautham R . Shenoy",
 Srinivas Pandruvada, K Prateek Nayak
References: <20231004123454.15691-1-anna-maria@linutronix.de>
From: Lukasz Luba
In-Reply-To: <20231004123454.15691-1-anna-maria@linutronix.de>

Hi Anna-Maria

On 10/4/23 13:34, Anna-Maria Behnsen wrote:
> Hi,
>

[snip]

>
> Testing
> ~~~~~~~
>
> Enqueue
> ^^^^^^^
>
> The impact of wasting cycles during enqueue by using the heuristic in
> contrast to always queuing the timer on the local CPU was measured with a
> micro benchmark. Therefore a timer is enqueued and dequeued in a loop with
> 1000 repetitions on a isolated CPU. The time the loop takes is measured. A
> quarter of the remaining CPUs was kept busy. This measurement was repeated
> several times. With the patch queue the average duration was reduced by
> approximately 25%.
>
> 145ns	plain v6
> 109ns	v6 with patch queue
>
>
> Furthermore the impact of residence in deep idle states of an idle system
> was investigated. The patch queue doesn't downgrade this behavior.
>
> dbench test
> ^^^^^^^^^^^
>
> A dbench test starting X pairs of client servers are used to create load on
> the system.
> The measurable value is the throughput. The tests were executed
> on a zen3 machine. The base is the tip tree branch timers/core which is
> based on a v6.6-rc1.
>
> governor menu
>
> X pairs timers/core      pull-model        impact
> ----------------------------------------------
> 1         353.19 (0.19)    353.45 (0.30)    0.07%
> 2         700.10 (0.96)    687.00 (0.20)   -1.87%
> 4        1329.37 (0.63)   1282.91 (0.64)   -3.49%
> 8        2561.16 (1.28)   2493.56 (1.76)   -2.64%
> 16       4959.96 (0.80)   4914.59 (0.64)   -0.91%
> 32       9741.92 (3.44)   8979.83 (1.13)   -7.82%
> 64      16535.40 (2.84)  16388.47 (4.02)   -0.89%
> 128     22136.83 (2.42)  23174.50 (1.43)    4.69%
> 256     39256.77 (4.48)  38994.00 (0.39)   -0.67%
> 512     36799.03 (1.83)  38091.10 (0.63)    3.51%
> 1024    32903.03 (0.86)  35370.70 (0.89)    7.50%
>
>
> governor teo
>
> X pairs timers/core      pull-model        impact
> ----------------------------------------------
> 1         350.83 (1.27)    352.45 (0.96)    0.46%
> 2         699.52 (0.85)    690.10 (0.54)   -1.35%
> 4        1339.53 (1.99)   1294.71 (2.71)   -3.35%
> 8        2574.10 (0.76)   2495.46 (1.97)   -3.06%
> 16       4898.50 (1.74)   4783.06 (1.64)   -2.36%
> 32       9115.50 (4.63)   9037.83 (1.58)   -0.85%
> 64      16663.90 (3.80)  16042.00 (1.72)   -3.73%
> 128     25044.93 (1.11)  23250.03 (1.08)   -7.17%
> 256     38059.53 (1.70)  39658.57 (2.98)    4.20%
> 512     36369.30 (0.39)  38890.13 (0.36)    6.93%
> 1024    33956.83 (1.14)  35514.83 (0.29)    4.59%
>
>
>
> Ping Pong Oberservation
> ^^^^^^^^^^^^^^^^^^^^^^^
>
> During testing on a mostly idle machine a ping pong game could be observed:
> a process_timeout timer is expired remotely on a non idle CPU. Then the CPU
> where the schedule_timeout() was executed to enqueue the timer comes out of
> idle and restarts the timer using schedule_timeout() and goes back to idle
> again. This is due to the fair scheduler which tries to keep the task on
> the CPU which it previously executed on.
>
>

I have tested this on my 2 Arm boards, with a mainline kernel and an
almost-mainline kernel. On both platforms it looks stable. The results
w/ your patchset look better.

1. rockpi4b - mainline kernel (but no UI)

Limiting the cpumask to only the 4 Little CPUs and setting the performance
governor for cpufreq and menu for idle.

1.1. perf bench sched pipe

                                    w/o patchset vs. w/ patchset
avg [ops/sec]:  (more is better)    23012.33     vs. 23154.33 (+0.6%)
avg [usecs/op]: (less is better)    43.453       vs. 43.187   (-0.6%)

1.2. perf bench sched messaging (less is better)

                                    w/o patchset vs. w/ patchset
avg total time [s]:                 2.7855       vs. 2.7005   (-3.1%)

2. pixel6 (kernel v5.18 with backported patchset)

2.1. Speedometer 2.0 (JS test running in Chrome browser)

w/o patchset vs. w/ patchset
149          vs. 146  (-2%)

2.2. Geekbench 5 (more is better)

Single core
w/o patchset vs. w/ patchset
1025         vs. 1017 (-0.7%)

Multi core
w/o patchset vs. w/ patchset
2756         vs. 2813 (+2%)

The performance looks good. Only one test, 'Speedometer', shows a
noticeably lower score.

Feel free to add:

Tested-by: Lukasz Luba

Regards,
Lukasz
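
The relative changes quoted in the results above can be re-derived from the
reported averages. Below is a minimal sketch in Python. The helper name
relative_change() and the layout of the results list are illustrative
assumptions, not part of perf, Geekbench or the patchset. Because the input
is the rounded scores as posted, the recomputed figures may differ in the
last digit from the percentages quoted in the mail.

```python
def relative_change(baseline: float, patched: float) -> float:
    """Return the change of `patched` relative to `baseline`, in percent."""
    return (patched - baseline) / baseline * 100.0


# (label, w/o patchset, w/ patchset), values copied from the results above.
# The labels are free-form strings chosen for this sketch.
results = [
    ("sched pipe [ops/sec]", 23012.33, 23154.33),
    ("sched pipe [usecs/op]", 43.453, 43.187),
    ("sched messaging [s]", 2.7855, 2.7005),
    ("Speedometer 2.0", 149, 146),
    ("Geekbench 5 single core", 1025, 1017),
    ("Geekbench 5 multi core", 2756, 2813),
]

for label, base, patched in results:
    # Positive means the value went up with the patchset applied; whether
    # that is an improvement depends on the metric (more/less is better).
    print(f"{label:24s} {relative_change(base, patched):+.1f}%")
```

The sign only says which direction the number moved. For the "less is
better" metrics (usecs/op, total time) a negative change is the improvement.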