Date: Thu, 21 Nov 2019 14:29:37 +0100
From: Peter Zijlstra
To: Phil Auld
Cc: Dave Chinner, Ming Lei, linux-block@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org,
    linux-kernel@vger.kernel.org, Jeff Moyer, Dave Chinner,
    Eric Sandeen, Christoph Hellwig, Jens Axboe, Ingo Molnar,
    Tejun Heo, Vincent Guittot
Subject: Re: single aio thread is migrated crazily by scheduler
Message-ID: <20191121132937.GW4114@hirez.programming.kicks-ass.net>
In-Reply-To: <20191120220313.GC18056@pauld.bos.csb>

On Wed, Nov 20, 2019 at 05:03:13PM -0500, Phil Auld wrote:
> On Wed, Nov 20, 2019 at 08:16:36PM +0100 Peter Zijlstra wrote:
> > On Tue, Nov 19, 2019 at 07:40:54AM +1100, Dave Chinner wrote:
> >
> > > Yes, that's precisely the problem - work is queued, by default, on a
> > > specific CPU and it will wait for a kworker that is pinned to that
> >
> > I'm thinking the problem is that it doesn't wait. If it went and waited
> > for it, active balance wouldn't be needed, that only works on active
> > tasks.
>
> Since this is AIO I wonder if it should queue_work on a nearby cpu by
> default instead of unbound.

The thing seems to be that 'unbound' is in fact 'bound'. Maybe we should
fix that. If the load-balancer were allowed to move the kworker around
when it didn't get time to run, that would probably be a better
solution. Picking another 'bound' cpu by random might create the same
sort of problems in more complicated scenarios.

TJ, ISTR there used to be actually unbound kworkers, what happened to
those? or am I misremembering things.

> > Lastly, one other thing to try is -next. Vincent reworked the
> > load-balancer quite a bit.
>
> I've tried it with the lb patch series. I get basically the same results.
> With the high granularity settings I get 3700 migrations for the 30
> second run at 4k. Of those about 3200 are active balance on stock 5.4-rc7.
> With the lb patches it's 3500 and 3000, a slight drop.

Thanks for testing that. I didn't expect miracles, but it is good to
verify.

> Using the default granularity settings 50 and 22 for stock and 250 and 25.
> So a few more total migrations with the lb patches but about the same active.
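For readers outside the thread, the workqueue behaviour being discussed
can be sketched with a toy module. This is illustrative only and not
code from the thread: the module and work-function names are made up,
while queue_work(), queue_work_on(), system_wq and system_unbound_wq
are the stock <linux/workqueue.h> API. queue_work() on a per-CPU
workqueue runs the item on a kworker bound to the submitting CPU (the
pinning Dave describes); an "unbound" workqueue hands it to a pool that
is still affine to a set of CPUs, which is roughly the "'unbound' is in
fact 'bound'" point.

/*
 * Illustrative toy module; names are invented, the workqueue calls are
 * the standard API.
 */
#include <linux/module.h>
#include <linux/workqueue.h>
#include <linux/smp.h>
#include <linux/cpumask.h>

static void demo_work_fn(struct work_struct *work)
{
	/* Which kworker runs this depends entirely on how it was queued. */
	pr_info("demo work ran on CPU %d\n", raw_smp_processor_id());
}

static DECLARE_WORK(demo_work, demo_work_fn);

static int __init demo_init(void)
{
	/*
	 * Default: system_wq is per-CPU, so the item is executed by a
	 * kworker bound to the CPU that called queue_work() -- the
	 * pinning behaviour discussed above.
	 */
	queue_work(system_wq, &demo_work);
	flush_work(&demo_work);

	/* Explicitly target another CPU instead. */
	if (cpu_online(1)) {
		queue_work_on(1, system_wq, &demo_work);
		flush_work(&demo_work);
	}

	/*
	 * system_unbound_wq uses "unbound" worker pools, but those are
	 * still affine to a set of CPUs (per NUMA node by default), so
	 * the scheduler cannot migrate the kworker arbitrarily.
	 */
	queue_work(system_unbound_wq, &demo_work);
	flush_work(&demo_work);

	return 0;
}

static void __exit demo_exit(void)
{
	flush_work(&demo_work);
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");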
Right, so the granularity thing interacts with the load-balance period.
Pushing it up, as some people appear to do, makes what might be a
temporal imbalance be perceived as a persistent imbalance.

Tying the load-balance period to the granularity is something we could
consider, but then I'm sure we'll get other people complaining that it
doesn't balance quickly enough anymore.
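The "granularity settings" referred to above are presumably the
sched_min_granularity_ns / sched_wakeup_granularity_ns sysctls, and the
load-balance period is the per-sched-domain balance_interval, scaled by
busy_factor on busy CPUs. Below is a purely hypothetical sketch of what
"tying the load-balance period to the granularity" could mean; the
helper name, scaling and clamp bounds are invented for illustration and
are not kernel code.

/*
 * Hypothetical sketch only: derive a balance interval from the
 * scheduling granularity instead of a fixed per-domain value.
 */
#include <linux/kernel.h>
#include <linux/jiffies.h>

static unsigned long
balance_interval_from_granularity(u64 min_granularity_ns,
				  unsigned int busy_factor)
{
	/* Express the scheduling granularity in jiffies... */
	unsigned long interval = nsecs_to_jiffies(min_granularity_ns);

	/*
	 * ...and stretch it on busy CPUs, mirroring the existing
	 * balance_interval * busy_factor scaling, so that raising the
	 * granularity also slows rebalancing instead of letting a
	 * short-lived imbalance look persistent.
	 */
	return clamp(interval * busy_factor, 1UL, (unsigned long)HZ);
}

In such a scheme, raising the granularity would automatically stretch
the balance period, trading fewer spurious active-balance migrations
for the slower rebalancing Peter expects people to complain about.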