Date: Wed, 10 Apr 2024 19:15:37 +0100
From: Qais Yousef
To: John Stultz
Cc: Vincent Guittot, Ingo Molnar, Peter Zijlstra, Juri Lelli,
 Steven Rostedt, Daniel Bristot de Oliveira, Thomas Gleixner,
 "Paul E. McKenney", Joel Fernandes, Dietmar Eggemann,
 linux-kernel@vger.kernel.org, Yabin Cui
Subject: Re: [PATCH] sched/pi: Reweight fair_policy() tasks when inheriting prio
Message-ID: <20240410181537.fqpix44uo43jvwct@airbuntu>
References: <20240404220500.dmfl2krll37znbi5@airbuntu>
 <20240405171653.boxbylrdak5fopjv@airbuntu>
 <20240407122700.ns7gknqwqkpjjyd4@airbuntu>
 <20240409061909.tb3vxc27h2eawiwg@airbuntu>
 <20240410065901.ruzhjsmtmpsnl4qe@airbuntu>
X-Mailing-List: linux-kernel@vger.kernel.org

On 04/10/24 10:30, John Stultz wrote:
> On Tue, Apr 9, 2024 at 11:59 PM Qais Yousef wrote:
> >
> > On 04/09/24 14:35, Vincent Guittot wrote:
> > > On Tue, 9 Apr 2024 at 08:19, Qais Yousef wrote:
> > > >
> > > > On 04/08/24 12:51, John Stultz wrote:
> > > > > On Mon, Apr 8, 2024 at 12:17 AM Vincent Guittot wrote:
> > > > > >
> > > > > > On Sun, 7 Apr 2024 at 14:27, Qais Yousef wrote:
> > > > > > >
> > > > > > > On 04/05/24 18:16, Qais Yousef wrote:
> > > > > > > >
> > > > > > > >
> > > > > > > > > All that to say that I think the weight is not applied on purpose.
> > > > > > > > > This might work for your particular case but there are more changes to
> > > > > > > > > be done if you want to apply prio inheritance between cfs tasks.
> > > > > > > > >
> > > > > > > > > As an example, what about the impact of cgroup on the actual weight
> > > > > > > > > and the inherited priority of a task ? If the owner and the waiter
> > > > > > > > > don't belong to the same cgroup their own prio is meaningless... task
> > > > > > > > > nice -20 in a group with a weight equal to nice 19 vs a task nice 19
> > > > > > > > > in a group with a weight equals to nice -20
> > > > > > > >
> > > > > > > > That is on my mind actually. But I thought it's a separate problem.
> > > > > > > > That has to do with how we calculate the effective priority of the pi_task.
> > > > > > > > And probably the sorting order to if we agree we need to revert the above.
> > > > > > > > If that is done
> > > > > > > >
> > > > > > > Thinking more about it the revert is not the right thing to do. We want fair
> > > > > > > tasks to stay ordered in FIFO for better fairness and avoid potential
> > > > > > > starvation issues. It's just the logic for searching the top_waiter need to be
> > > > > > > different. If the top_waiter is fair, then we need to traverse the tree to find
> > > > > > > the highest nice value. We probably can keep track of this while adding items
> > > > > > > to the tree to avoid the search.
> > > > > > >
> > > > > > > For cgroup; is it reasonable (loosely speaking) to keep track of pi_cfs_rq and
> > > > > > > detach_attach_task_cfs_rq() before the reweight? This seems the most
> > > > > > > straightforward solution and will contain the complexity to keeping track of
> > > > > > > cfs_rq. But it'll have similar issue to proxy execution where a task that
> > > > > > > doesn't belong to the cgroup will consume its share..
> > > > > >
> > > > > > That's a good point, Would proxy execution be the simplest way to fix all this ?
> > > >
> > > > Is it? Over 4.5 years ago Unity reported to me about performance inversion
> > > > problem and that's when proxy execution work was revived as simplest way to fix
> > > > all of this. But still no end in sight from what I see. I was and still think
> > > > an interim solution in rt_mutex could help a lot of use cases already without
> > > > being too complex. Not as elegant and comprehensive like proxy execution, but
> > > > given the impact on both userspace and out of tree kernel hacks are growing
> > > > waiting for this to be ready, the cost of waiting is high IMHO.
> > > >
> > > > FWIW, I already heard several feedbacks that PTHREAD_PRIO_INHERIT does nothing.
> > > > I think this reweight issue is more serious problem and likely why I heard this
> > > > feedback. I could be underestimating the complexity of the fix though. So I'll
> > >
> > > Without cgroup, the solution could be straightforward but android uses
> > > extensively cgroup AFAICT and update_cfs_group() makes impossible to
> > > track the top cfs waiter and its "prio"
> >
> > :(
> >
> > IIUC the issue is that we can't easily come up with a single number of
> > 'effective prio' for N level hierarchy and compare it with another M level
> > hierarchy..
> >
> > Does proxy execution fix this problem then? If we can't find the top waiter,
> > I can't see how proxy execution would work here too. To my understanding it's
> > more about how we apply inheritance (by donating execution context of the top
> > waiter) instead of manually applying inheritance like we're doing now.
>
> So, while proxy provides a sort of generalized inheritance, it isn't
> deep enough in the class scheduler logic to need to really think about
> priority/cgroups.
>
> It just looks at what gets selected to run. That's the most important
> task at that moment. It doesn't really need to care about how/why,
> that's left to pick_next_task().
>
> Since it leaves mutex blocked tasks on the RQ, it allows the class
> scheduler logic to pick the most important task (mutex-blocked or not)
> to run. Then if a mutex-blocked task gets selected, we will then find
> the mutex owner and run it instead so it can release the lock. When
> locks are released, if the owner has a "donor" task, the lock is
> handed off to the donor. So, this basically uses the
> pick_next_task()'s evaluation of what it wanted to run to effectively
> provide the "top waiter".

Thanks John. So there's no top waiter and all tasks are left runnable, makes sense.
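
The mechanism John describes above can be sketched as a toy model. This is
only an illustration of that description, not kernel code: Task, Mutex,
find_proxy(), schedule() and release() are hypothetical names, and the
"lowest nice value wins" rule is an assumption standing in for whatever
the real pick_next_task() would decide (cgroup weights included).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(eq=False)
class Mutex:
    name: str
    owner: Optional["Task"] = None


@dataclass(eq=False)
class Task:
    name: str
    nice: int                            # toy importance: lower is more important
    blocked_on: Optional[Mutex] = None   # mutex this task is waiting for, if any


def pick_next_task(runqueue: List[Task]) -> Task:
    """Hypothetical stand-in for the class scheduler's choice.

    Mutex-blocked tasks stay on the runqueue, so they can be selected too.
    """
    return min(runqueue, key=lambda t: t.nice)


def find_proxy(selected: Task) -> Task:
    """Follow the blocked_on chain from the selected task to a lock owner
    that is not itself blocked; that owner is the task that really runs."""
    task = selected
    seen = set()
    while task.blocked_on is not None and task.blocked_on.owner is not None:
        if id(task) in seen:
            raise RuntimeError("blocked_on chain loops (deadlock)")
        seen.add(id(task))
        task = task.blocked_on.owner
    return task


def schedule(runqueue: List[Task]) -> Tuple[Task, Task]:
    """Return (donor, running): the task the scheduler wanted to run and
    the task that runs on its behalf."""
    donor = pick_next_task(runqueue)
    return donor, find_proxy(donor)


def release(mutex: Mutex, donor: Optional[Task]) -> None:
    """On unlock, hand the lock to the donor if it was waiting on this mutex
    (a simplification of the hand-off described in the thread)."""
    if donor is not None and donor.blocked_on is mutex:
        mutex.owner = donor
        donor.blocked_on = None
    else:
        mutex.owner = None


# Example: an important waiter blocked on a lock held by an unimportant owner.
lock = Mutex("m")
owner = Task("owner", nice=19)
waiter = Task("waiter", nice=-20, blocked_on=lock)
lock.owner = owner

donor, running = schedule([owner, waiter])
print(donor.name, running.name)   # waiter owner
```

In this model there is no separate "top waiter" lookup: whichever blocked
task the scheduler selects acts as the donor, which is why comparing
priorities across cgroup hierarchies never has to happen outside
pick_next_task().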