Date: Thu, 24 Sep 2020 10:27:17 +0200
From: peterz@infradead.org
To: Steven Rostedt
Cc: Thomas Gleixner, Linus Torvalds, LKML, linux-arch, Paul McKenney,
    the arch/x86 maintainers, Sebastian Andrzej Siewior, Juri Lelli,
    Vincent Guittot, Dietmar Eggemann, Ben Segall, Mel Gorman,
    Daniel Bristot de Oliveira, Will Deacon, Andrew Morton, Linux-MM,
    Russell King, Linux ARM, Chris Zankel, Max Filippov,
    linux-xtensa@linux-xtensa.org, Jani Nikula, Joonas Lahtinen,
    Rodrigo Vivi, David Airlie, Daniel Vetter, intel-gfx, dri-devel,
    Ard Biesheuvel, Herbert Xu, Vineet Gupta,
    "open list\:SYNOPSYS ARC ARCHITECTURE", Arnd Bergmann, Guo Ren,
    linux-csky@vger.kernel.org, Michal Simek, Thomas Bogendoerfer,
    linux-mips@vger.kernel.org, Nick Hu, Greentime Hu, Vincent Chen,
    Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
    linuxppc-dev, "David S.
    Miller", linux-sparc
Subject: Re: [patch RFC 00/15] mm/highmem: Provide a preemptible variant of kmap_atomic & friends
Message-ID: <20200924082717.GA1362448@hirez.programming.kicks-ass.net>
References: <87mu1lc5mp.fsf@nanos.tec.linutronix.de>
    <87k0wode9a.fsf@nanos.tec.linutronix.de>
    <87eemwcpnq.fsf@nanos.tec.linutronix.de>
    <87a6xjd1dw.fsf@nanos.tec.linutronix.de>
    <87sgbbaq0y.fsf@nanos.tec.linutronix.de>
    <20200923084032.GU1362448@hirez.programming.kicks-ass.net>
    <20200923115251.7cc63a7e@oasis.local.home>
In-Reply-To: <20200923115251.7cc63a7e@oasis.local.home>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Sep 23, 2020 at 11:52:51AM -0400, Steven Rostedt wrote:
> On Wed, 23 Sep 2020 10:40:32 +0200
> peterz@infradead.org wrote:
>
> > However, with migrate_disable() we can have each task preempted in a
> > migrate_disable() region, worse we can stack them all on the _same_ CPU
> > (super ridiculous odds, sure). And then we end up only able to run one
> > task, with the rest of the CPUs picking their nose.
>
> What if we just made migrate_disable() a local_lock() available for !RT?

Can't, neither migrate_disable() nor migrate_enable() is allowed to
block -- which is what makes their implementation so 'interesting'.

> This should lower the SHC in theory, if you can't have stacked migrate
> disables on the same CPU.

See this email in that other thread (you're on Cc too IIRC):

  https://lkml.kernel.org/r/20200923170809.GY1362448@hirez.programming.kicks-ass.net

I think that if we 'frob' the balance PULL, we'll end up with something
similar.

Whichever way around we turn this thing, the migrate_disable() runtime
(we'll have to add a tracer for that) will be an interference term on
the lower priority task, exactly like preempt_disable() would be. We'll
just not exclude a higher priority task from running.
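
The stacking case quoted above can be illustrated with a small user-space
toy model. This is only a sketch under stated assumptions, not kernel code:
every task is assumed to have been preempted inside a migrate_disable()
region, so it can only resume on the CPU recorded for it. The names
runnable_cpus(), pinned_cpu and MAX_CPUS are made up for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CPUS 64	/* arbitrary limit for this toy model */

/*
 * Toy model, not kernel code: task i was preempted inside a
 * migrate_disable() region and may only resume on pinned_cpu[i].
 * Returns how many of the nr_cpus CPUs can run a task at the same time.
 */
static size_t runnable_cpus(const int *pinned_cpu, size_t nr_tasks, int nr_cpus)
{
	bool busy[MAX_CPUS] = { false };
	size_t count = 0;

	assert(nr_cpus > 0 && nr_cpus <= MAX_CPUS);

	for (size_t i = 0; i < nr_tasks; i++) {
		int cpu = pinned_cpu[i];

		assert(cpu >= 0 && cpu < nr_cpus);
		if (!busy[cpu]) {
			busy[cpu] = true;	/* first pinned task claims this CPU */
			count++;
		}
	}
	return count;
}
```

With four tasks all pinned to CPU 0 on a four-CPU machine the function
returns 1, i.e. three CPUs sit idle; with the same tasks spread over CPUs
0..3 it returns 4.
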

AFAICT, the best case is a single migrate_disable() nesting, where a
higher priority task preempts in a migrate_disable() section -- this is
per design.

When this preempted task becomes eligible to run under the ideal model
(IOW it becomes one of the M highest priority tasks), it might still
have to wait for the preemptee's migrate_disable() section to complete,
thereby suffering interference for exactly the duration of that
migrate_disable() section.

Per this argument, the change from preempt_disable() to
migrate_disable() gets us:

 - higher priority tasks gain reduced wake-up latency

 - lower priority tasks are unchanged and are subject to the exact same
   interference term as if the higher priority task were using
   preempt_disable().

Since we've already established this term is unbounded, any task but the
highest priority task is basically buggered.

TL;DR, if we get balancing fixed and achieve (near) the optimal case
above, migrate_disable() is an over-all win. But it's provably
non-deterministic as long as the migrate_disable() sections are
non-deterministic.

The reason this all mostly works in practice is (I think) because:

 - People care most about the higher prio RT tasks and craft them to
   mostly avoid the migrate_disable() infected code.

 - The preemption scenario is most pronounced at higher utilization
   scenarios, and I suspect this is fairly rare to begin with.

 - And while many of these migrate_disable() regions are unbounded in
   theory, in practice they're often fairly reasonable.

So my current todo list is:

 - Change RT PULL

 - Change DL PULL

 - Add migrate_disable() tracer; exactly like preempt/irqoff, except
   measuring task-runtime instead of cpu-time.

 - Add a mode that measures actual interference.

 - Add a traceevent to detect preemption in migrate_disable().

And then I suppose I should twist Daniel's arm to update his model to
include these scenarios and numbers.
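
For the tracer item on the list above, the accounting difference could
look roughly like the user-space sketch below. It is not an existing
kernel interface: struct md_trace and the md_*() hooks are hypothetical
names, and timestamps are passed in by the caller. The runtime clock only
advances while the task is on a CPU, so time spent preempted inside the
section is left out of max_runtime; max_wall keeps the plain entry-to-exit
time purely for comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-task state; none of these names exist in the kernel. */
struct md_trace {
	bool     in_section;	/* between migrate_disable() and migrate_enable() */
	bool     on_cpu;	/* task is currently running */
	uint64_t section_start;	/* timestamp of section entry */
	uint64_t last_in;	/* when the runtime clock last (re)started */
	uint64_t runtime;	/* time spent running in the current section */
	uint64_t max_runtime;	/* longest section, measured in task runtime */
	uint64_t max_wall;	/* longest section, measured entry to exit */
};

/* Called where the task would enter a migrate_disable() section. */
static void md_enter(struct md_trace *t, uint64_t now)
{
	assert(t->on_cpu && !t->in_section);
	t->in_section = true;
	t->section_start = now;
	t->last_in = now;
	t->runtime = 0;
}

/* Called when the task is scheduled out (preempted). */
static void md_sched_out(struct md_trace *t, uint64_t now)
{
	assert(t->on_cpu);
	if (t->in_section)
		t->runtime += now - t->last_in;	/* stop the runtime clock */
	t->on_cpu = false;
}

/* Called when the task is scheduled back in. */
static void md_sched_in(struct md_trace *t, uint64_t now)
{
	assert(!t->on_cpu);
	t->on_cpu = true;
	t->last_in = now;	/* restart the runtime clock */
}

/* Called where the task would leave the migrate_disable() section. */
static void md_exit(struct md_trace *t, uint64_t now)
{
	assert(t->on_cpu && t->in_section);

	uint64_t wall = now - t->section_start;

	t->runtime += now - t->last_in;
	if (t->runtime > t->max_runtime)
		t->max_runtime = t->runtime;
	if (wall > t->max_wall)
		t->max_wall = wall;
	t->in_section = false;
}
```

A task that enters at t=100, is preempted at t=130, resumes at t=500 and
leaves at t=520 is accounted 50 units of task runtime, against 420 units
from entry to exit.
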